Abstract

Many applications in cellular systems and sensor networks involve a random subset of a large 
number of users asynchronously reporting activity to a base station. This paper examines the problem 
of multiuser detection (MUD) in random access channels for such applications. Traditional orthogonal 
signaling ignores the random nature of user activity in this problem and limits the total number of users 
to be on the order of the number of signal space dimensions. Contention-based schemes, on the other 
hand, suffer from delays caused by colliding transmissions and the hidden node problem. In contrast, this 
paper presents a novel pairing of an asynchronous non-orthogonal code-division random access scheme 
with a convex optimization-based MUD algorithm that overcomes the issues associated with orthogonal 
signaling and contention-based methods. Two key distinguishing features of the proposed MUD algorithm 
are that it does not require knowledge of the delay or channel state information of every user and it 
has polynomial-time computational complexity. The main analytical contribution of this paper is the 
relationship between the performance of the proposed MUD algorithm in the presence of arbitrary or 
random delays and two simple metrics of the set of user codewords. The study of these metrics is then 
focused on two specific sets of codewords, random binary codewords and specially constructed algebraic 
codewords, for asynchronous random access. The ensuing analysis confirms that the proposed scheme 
together with either of these two codeword sets significantly outperforms the orthogonal signaling-based 
random access in terms of the total number of users in the system. 
I. Introduction

Many applications of wireless networks require servicing a large number of users that share limited 
communication resources. In particular, the term random access is commonly used to describe a setup 
where a random subset of users in the network communicate with a base station (BS) in an uncoordinated 
fashion [1]. In this paper, we study random access in large networks for the case when active users transmit
single bits to the BS. This so-called "on-off" random access channel (RAC) [?] represents an abstraction
that arises frequently in many wireless networks. In third-generation cellular systems, for example, control
channels used for scheduling requests can be modeled as on-off RACs; in this case, users requesting
permissions to send data to the BS can be thought of as transmitting 1's and inactive users can be thought
of as transmitting 0's. Similarly, uplinks in wireless sensor networks deployed for target detection can
also be modeled as on-off RACs; in this case, sensors that detect a target can be made to transmit 1's
and sensors that have nothing to report can be thought of as transmitting 0's.^1

The primary objective of the BS in on-off RACs is to reliably and efficiently carry out multiuser 
detection (MUD), which translates into recovery of the set of active users in our case. The two biggest 
impediments to this goal are that (i) random access tends to be asynchronous in nature, and (ii) it is quite 
difficult, if not impossible, for the BS to know the channel state information (CSI) of every user. Given 
a fixed number of temporal signal space dimensions N in the uplink, the system-design goal therefore 
is to simultaneously maximize the total number of users M in the network and the average number of 
active users k that the BS can reliably handle without requiring knowledge of the delays or CSIs of the 
individual users at the BS. 

Traditional approaches to random access fall significantly short of this design objective. In random 
access methods based on orthogonal signaling, the signal space dimensions are orthogonally spread 
among the M users in either time, frequency, or code [?]. While this establishes a dedicated, interference-free
channel between each user and the BS, this approach ignores the random nature of user activity
in RACs. Therefore, by its very structure, random access based on orthogonal signaling dictates the 
relationship k < M < N. On the other hand, contention-based random access schemes such as ALOHA 
and carrier sense multiple access (CSMA) do take advantage of the random user activity [3]. However, 
significant problems arise in these schemes when the average number of active users k and/or the total
number of users M gets large [?]. In the case of ALOHA, collisions and retransmissions accumulate to
significant delays as k becomes large. In the case of CSMA, the number of potential "hidden nodes"
grows as M increases, resulting in unintended and unrecognized collisions in large networks.

^1 The focus of this paper is on servicing a large number of users that share limited communication resources in the uplink.
Limiting ourselves to on-off RACs in this case helps us isolate the key issues associated with designing arbitrary RACs involving
(multiple-bit) packet transmissions in large networks.

June 22, 2011 DRAFT

Cellular systems, partly because of the aforementioned reasons, typically resort to the use of matched 
filter receivers on uplink control channels. Such receivers correspond to single-user detection (SUD) since 
they detect each user independently, treating the interference from other active users as noise. However, 
despite the effectiveness of these receivers in today's cellular systems, SUD schemes also have significant 
pitfalls. In particular, such schemes tend to have suboptimal performance since they do not carry out 
joint detection and they tend to be prone to the "near-far" effect [2].

In order to overcome the issues associated with orthogonal signaling, contention-based methods and 
SUD schemes, we present in this paper a novel code-division random access (CDRA) scheme that spreads 
the uplink communication resources in a non-orthogonal manner among the M users and leverages the 
random user activity to service significantly more total users than $N$. A key distinguishing feature of
the proposed scheme is that it makes use of a convex optimization-based MUD algorithm that does not
require knowledge of the delays or CSIs of the users at the BS. In addition, we present an efficient
implementation of the proposed algorithm based on the fast Fourier transform (FFT) that ensures that its
computational complexity at worst differs by a logarithmic factor from an oracle-based MUD algorithm
that has perfect knowledge of the user delays. Our main analytical contribution is the relationship between
the probability of error $P_{\mathrm{err}}$ of the proposed MUD algorithm in the presence of arbitrary or random
delays and two metrics of the set of codewords assigned to the users. We then make use of these
metrics to analyze two specific sets of codewords, random binary codewords and specially constructed
(deterministic) algebraic codewords, for the proposed random access scheme. Specifically, we show that
both these codewords allow our scheme to successfully manage an average number of active users that
is almost linear in $N$: $k \lesssim N/(\tau \log(M\tau))$ for arbitrary delays and $k \lesssim N/\log(M\tau)$ for uniformly
random delays, where $\tau$ denotes the maximum delay in the network. More importantly, we show that the set
of random codewords enables our scheme to service a number of total users that (ignoring $\tau$) is
super-polynomial in $N$, $M \lesssim \exp(O(N^{1/3}))$, while the set of deterministic codes, which facilitate efficient
codeword construction and storage, enables it to service a number of total users that is polynomial in $N$,
$M \lesssim N^t$ for any reasonably sized $t \geq 2$.^2

^2 Recall the "Big-O" notation: $f(n) = O(g(n))$ (alternatively, $f(n) \lesssim g(n)$) if $\exists\, c_o > 0, n_o : \forall\, n \geq n_o,\ f(n) \leq c_o g(n)$.

It is useful at this point to also consider non-orthogonal code-division multiple access (CDMA),
which, like our scheme, also spreads the uplink communication resources in a non-orthogonal manner
among M > N users [?]. However, despite similarities at the codeword-assignment level, there are
significant differences between non-orthogonal CDMA and the work presented here. First, non-orthogonal 
CDMA is used for applications in which a fixed set of users continually communicate with the BS, 
whereas our scheme corresponds to a random subset of users in a large network communicating single 
bits to the BS. Second, MUD schemes for non-orthogonal CDMA require that the BS has knowledge of 
the individual user delays, whereas we assume — partly because of the random user activity — that user 
delays are unknown at the BS. 

In terms of related prior work, Fletcher et al. [2] have also recently studied the problem of MUD in
on-off RACs. However, the results in [2], while similar in spirit to the ones in here, are limited by
the facts that [2]: (i) assumes perfect synchronization among the M users, which is hard to guarantee
in practical settings for large M; (ii) assumes that CSIs of the individual users are available to the BS
in certain cases, which is difficult, if not impossible, to justify for the case of fading RACs; and (iii)
only guarantees that the probability of error $P_{\mathrm{err}}$ at the BS goes to zero asymptotically in M, which
does not shed light on the scaling of $P_{\mathrm{err}}$. More recently, we have become aware of the independent
and simultaneous work in [?] and [?] that also considers on-off RACs in the context of configuration in
ad-hoc wireless networks. However, [?] and [?] also make a synchronization assumption similar to [2].
Finally, the work presented here also has implications in the area of sparse signal recovery, and it relates
to some recent work in model selection and compressed sensing [6], [?]. We defer a detailed discussion
of these implications and relationships to later parts of the paper.

We use the following notational conventions throughout the rest of the paper. We use lowercase and 
uppercase bold-faced letters, such as x and X, to represent vectors and matrices, respectively, while we 
use $(\cdot)^T$ to denote transposition of vectors and matrices. The identity matrix and the all-zeros vector are
denoted by $\mathbf{I}$ and $\mathbf{0}$, respectively, and their dimensions are either given by context or explicitly shown in
subscripts. The notation $\mathcal{N}(m, \sigma^2)$ signifies the Gaussian distribution with mean $m$ and standard deviation
$\sigma$, binary$(\pm 1/\sqrt{N}, \mathbf{I}_N)$ denotes an $N$-length Rademacher distribution in which each entry independently
takes value $+1/\sqrt{N}$ or $-1/\sqrt{N}$ each with probability 1/2, and $\mathbb{E}[\cdot]$ denotes the expectation of a random
variable. We use $\Pr(\cdot)$ to denote the probability of an event and $\Pr(\cdot \mid C)$ as the probability conditioned
on an event $C$. The notation $\langle \mathbf{x}, \mathbf{y} \rangle$ is used to denote the inner product between vectors $\mathbf{x}$ and $\mathbf{y}$. Finally,
$\log(\cdot)$ is taken as the natural logarithm throughout the paper.

The remainder of the paper is organized as follows. In Section II, we introduce our system model and
accompanying assumptions. In Section III, we describe our approach to MUD for asynchronous
(non-orthogonal) CDRA and specify its performance for both arbitrary and random delays in terms of two
metrics of the set of user codewords. In Section IV, we specialize the results of Section III to random
binary codewords and specially constructed algebraic codewords. We finally conclude in Section V by
reporting results of some numerical experiments and discussing connections of our work in the area of
sparse signal recovery.

II. System Model 

In this section, we formalize the problem of MUD in asynchronous on-off RACs by introducing our
system model and accompanying assumptions. To begin, we assume that there are a total of M users in
the network that communicate with the BS using waveforms of duration T and (two-sided) bandwidth W;
in other words, the total number of temporal signal space dimensions (degrees of freedom) in the uplink
is $N = TW$. In this paper, we propose that users communicate using spread spectrum waveforms:

$$x_i(t) = \sqrt{\mathcal{E}_i} \sum_{n=0}^{N-1} x_n^i \, g(t - nT_c), \quad t \in [0, T), \tag{1}$$

where $g(t)$ is a unit-energy prototype pulse ($\int |g(t)|^2 dt = 1$), $T_c = T/N$ is the chip duration, $\mathcal{E}_i$ denotes
the transmit power of the i-th user, and

$$\mathbf{x}_i = \begin{bmatrix} x_0^i & x_1^i & \cdots & x_{N-1}^i \end{bmatrix}^T, \quad i = 1, \dots, M, \tag{2}$$

is the N-length real-valued codeword of unit energy ($\|\mathbf{x}_i\|_2 = 1$) assigned to the i-th user.

In the context of on-off RACs, we assume that on average a total of k of the M users transmit 1's at
time $t = 0$ (without loss of generality), resulting in the following received signal at the BS

$$y(t) = \sum_{i=1}^{M} h_i \delta_i x_i(t - \tau_i) + w(t). \tag{3}$$

Here, $h_i \in \mathbb{R}$ and $\tau_i \in \mathbb{R}_+$ are the channel fading coefficient^3 and the delay^4 associated with the i-th user,
respectively, $w(t)$ is additive white Gaussian noise (AWGN) introduced by the receiver circuitry, and $\{\delta_i\}$
are independent 0-1 Bernoulli random variables that model the random activation of the M users in the
sense that $\Pr(\delta_i = 1) = k/M$. Finally, we assume that user transmissions undergo independent fading
and each $h_i$ has a symmetric distribution on $\mathbb{R}$ (e.g., Rayleigh fading with $h_i$ distributed as $\mathcal{N}(0, \rho_i^2)$).

^3 We take fading coefficients in $\mathbb{R}$ since we are assuming real-valued codewords. Modifications for the complex-valued case
are tedious but straightforward.

^4 One of the major differences between [2], [?] and the setup in here is that it is assumed in [2], [?] that $\max_{i,j} (\tau_i - \tau_j) < T_c$,
whereas we do not make this assumption since it is nearly impossible to satisfy this condition for large-enough values of M.
Next, we define the individual discrete delays $\tau'_i \in \mathbb{Z}_+$ as $\tau'_i = \lfloor \tau_i / T_c \rfloor$ and define the maximum discrete
delay $\tau \in \mathbb{Z}_+$ in the system as an upper bound on the delays satisfying $\tau \geq \max_i \tau'_i$. It is easy to see
that the received signal $y(t)$ at the BS can be sampled at the chip rate to obtain an equivalent discrete
approximation

$$\mathbf{y} \approx \sum_{i=1}^{M} h_i \delta_i \sqrt{\mathcal{E}_i}\, \tilde{\mathbf{x}}_i + \mathbf{w}, \tag{4}$$

which tends to be quite accurate as long as point sampling is employed and $g(t)$ is close to being a square
pulse. Here, the AWGN vector $\mathbf{w}$ is distributed as $\mathcal{N}(\mathbf{0}_{N+\tau}, \mathbf{I}_{N+\tau})$, the instantaneous received signal
to noise ratio (SNR) of the active users is $\mathcal{E}_i |h_i|^2$, and the vectors $\tilde{\mathbf{x}}_i \in \mathbb{R}^{N+\tau}$ are defined as

$$\tilde{\mathbf{x}}_i = \begin{bmatrix} \mathbf{0}_{\tau'_i}^T & \mathbf{x}_i^T & \mathbf{0}_{\tau - \tau'_i}^T \end{bmatrix}^T, \quad i = 1, \dots, M. \tag{5}$$

The assumptions we make here are that (i) the maximum delay $\tau$ is known at the BS and (ii) each user
has knowledge of the SNR at which its transmitted signal arrives at the BS (in other words, the i-th
user knows $|h_i|$). Both these assumptions are quite reasonable from a practical perspective; in particular,
if one assumes that the BS transmits a beacon signal before the users start transmitting then the last
assumption follows because of reciprocity between the downlink and uplink.
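To make the chip-rate model in (4) and (5) concrete, the following NumPy sketch draws one realization of the received vector. It is only an illustration under stated assumptions: the codewords are unit-energy Rademacher sequences (one of the two families studied in Section IV), delays are integers drawn uniformly from 0 to the maximum delay, the noise has unit variance, and every active user arrives with the same magnitude `snr_amp` and a random sign. The function name `simulate_uplink` and all parameter names are hypothetical and do not come from the paper.

```python
import numpy as np

def simulate_uplink(N, M, k, tau, snr_amp=10.0, seed=0):
    """Draw one realization of y = sum_i h_i * delta_i * sqrt(E_i) * x~_i + w."""
    rng = np.random.default_rng(seed)
    # Unit-energy Rademacher codewords, one column per user (illustrative choice).
    codebook = rng.choice([-1.0, 1.0], size=(N, M)) / np.sqrt(N)
    # Each user is active independently with probability k/M.
    active = rng.random(M) < k / M
    # Discrete delays in {0, ..., tau}; only those of active users matter.
    delays = rng.integers(0, tau + 1, size=M)
    # Received amplitude h_i * sqrt(E_i): random sign (symmetric fading) times
    # a fixed magnitude, which is an assumption made for this sketch only.
    beta = np.where(active, rng.choice([-1.0, 1.0], size=M) * snr_amp, 0.0)
    # Unit-variance AWGN of length N + tau.
    y = rng.standard_normal(N + tau)
    for i in np.flatnonzero(active):
        # Zero-padded, delayed codeword x~_i as in (5).
        y[delays[i]:delays[i] + N] += beta[i] * codebook[:, i]
    return y, codebook, active, delays, beta
```

A call such as `simulate_uplink(N=64, M=128, k=5, tau=10)` returns a length-74 observation together with the ground truth needed to score a detector.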

Our goal now is to specify a MUD algorithm for this asynchronous CDRA scheme that returns an
estimate $\hat{\mathcal{I}}$ of the set of active users $\mathcal{I} = \{i : \delta_i = 1\}$ from the $(N + \tau)$-dimensional vector $\mathbf{y}$ without
knowledge of the set of delays $\{\tau'_i\}$ or the set of channel coefficients $\{h_i\}$ at the BS. Note that a
benchmark for any such algorithm is synchronous, orthogonal signaling-based random access, which
dictates the relationship k < M < N. Therefore, the primary objective of our algorithm must be to
successfully manage an average number of active users that is almost linear in N, but also service a total
number of users in the uplink that is significantly larger than N. In addition to this primary objective, we
are also interested in specifying the probability of error, $P_{\mathrm{err}} \overset{\mathrm{def}}{=} \Pr(\hat{\mathcal{I}} \neq \mathcal{I})$, and providing a low-complexity
implementation of the MUD algorithm. In the next section, we propose an algorithm that explicitly takes
advantage of the random user activity in the network to successfully meet all these objectives.

III. Multiuser Detection Using The Lasso 

In this section, we propose a MUD algorithm for asynchronous CDRA that is based on the mixed-norm
convex optimization program known as the lasso [?]. The lasso was first proposed in the statistics
literature for linear regression in underdetermined settings. In [2], the lasso has been suggested as a
potential method for MUD in synchronous on-off RACs. However, extending the ideas of [2] to the
asynchronous case using the standard lasso formulation seems very difficult. In contrast, while the MUD
algorithm proposed in this section is based on the lasso, we present a rather nonconventional usage of
the lasso that is specific to the problem at hand. One of our major contributions indeed is establishing
that this formulation is guaranteed to yield successful MUD with high probability. The fact that further
differentiates our work from [2] is that we relate the performance of the proposed MUD algorithm for
both arbitrary and random delays to two simple metrics of the set of user codewords, which enables us to
construct specialized codewords for different applications. The analysis carried out in this regard might
also be of independent interest to researchers working on configuration (neighbor discovery) in ad-hoc
wireless networks and sensor networks. These results also have connections with the area of sparse signal
recovery, as noted in Section IV-A and Section V-B.



A. Main Results 

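In order to make use of the lasso for MUD in asynchronous on-off RACs, we first rewrite (4) as

$$\mathbf{y} = \begin{bmatrix} \tilde{\mathbf{x}}_1 & \tilde{\mathbf{x}}_2 & \cdots & \tilde{\mathbf{x}}_M \end{bmatrix} \boldsymbol{\beta} + \mathbf{w}, \tag{6}$$

where the i-th entry of the vector $\boldsymbol{\beta} \in \mathbb{R}^M$ is described as $\beta_i \overset{\mathrm{def}}{=} h_i \delta_i \sqrt{\mathcal{E}_i}$. While (6) appears superficially
similar to the standard lasso formulation, we cannot use the lasso to obtain an estimate of the set of
active users $\mathcal{I}$ from (6) since the $(N + \tau) \times M$ matrix $\widetilde{\mathbf{X}} = [\tilde{\mathbf{x}}_1 \ \cdots \ \tilde{\mathbf{x}}_M]$ in (6) is unknown due to the asynchronous
nature of the problem. In order to overcome this obstacle, we first define $(N + \tau) \times (\tau + 1)$ Toeplitz
matrices $\mathbf{X}_i$ as

$$\mathbf{X}_i = \begin{bmatrix} \mathbf{x}_i & & \mathbf{0}_\tau \\ & \ddots & \\ \mathbf{0}_\tau & & \mathbf{x}_i \end{bmatrix}, \quad i = 1, \dots, M, \tag{7}$$

that is, the j-th column of $\mathbf{X}_i$ is the codeword $\mathbf{x}_i$ delayed by $j - 1$ chips and zero-padded to length $N + \tau$,
and observe that we can equivalently write (6) in the form

$$\mathbf{y} = \begin{bmatrix} \mathbf{X}_1 & \mathbf{X}_2 & \cdots & \mathbf{X}_M \end{bmatrix} \begin{bmatrix} \tilde{\boldsymbol{\beta}}_1^T & \tilde{\boldsymbol{\beta}}_2^T & \cdots & \tilde{\boldsymbol{\beta}}_M^T \end{bmatrix}^T + \mathbf{w} = \mathbf{X} \tilde{\boldsymbol{\beta}} + \mathbf{w}, \tag{8}$$

where $\mathbf{X}$ is now a $(N + \tau) \times M(\tau + 1)$ known matrix, which we term the expanded codebook. The
vector $\tilde{\boldsymbol{\beta}} \in \mathbb{R}^{M(\tau+1)}$ is a concatenation of M vectors, each of length $(\tau + 1)$, whose entries are given
by $\tilde{\beta}_{i,j} = \beta_i 1_{\{\tau'_i = j - 1\}}$, $i = 1, \dots, M$, $j = 1, \dots, \tau + 1$. We make use of this notation to describe the
proposed lasso-based MUD algorithm for asynchronous CDRA in Algorithm 1.^5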



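Algorithm 1 Multiuser Detection in Asynchronous On-Off Random Access Channels Using the Lasso
Inputs
1) The chip-rate sampled vector $\mathbf{y}$
2) Set of N-dimensional codewords $\{\mathbf{x}_i\}_{i=1}^{M}$
3) Maximum discrete delay $\tau$ in the uplink
4) A regularization parameter $\lambda$ for the lasso
Construct the expanded codebook $\mathbf{X}$ described in (8) using $\{\mathbf{x}_i\}$ and $\tau$
$\hat{\boldsymbol{\beta}} \leftarrow \arg\min_{\mathbf{b} \in \mathbb{R}^{M(\tau+1)}} \tfrac{1}{2}\|\mathbf{y} - \mathbf{X}\mathbf{b}\|_2^2 + \lambda \|\mathbf{b}\|_1$ (LASSO)
$\hat{\mathcal{I}} \leftarrow \{i : \|\hat{\boldsymbol{\beta}}_i\|_0 > 0\}$
Return $\hat{\mathcal{I}}$ as an estimate of the set of active users $\mathcal{I}$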
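The steps of Algorithm 1 can be sketched in a few lines of NumPy. The sketch below is not the implementation used in the paper, which relies on SpaRSA; it swaps in plain iterative soft thresholding (ISTA) as the lasso solver because ISTA is short and self-contained. The function names, the iteration count `n_iter`, and the numerical tolerance `tol` that stands in for the exact test of a nonzero sub-vector are all assumptions made for illustration.

```python
import numpy as np

def expanded_codebook(codebook, tau):
    """Stack the (N+tau) x (tau+1) Toeplitz blocks X_i side by side."""
    N, M = codebook.shape
    X = np.zeros((N + tau, M * (tau + 1)))
    for i in range(M):
        for j in range(tau + 1):
            # Column j of block i is codeword i delayed by j chips.
            X[j:j + N, i * (tau + 1) + j] = codebook[:, i]
    return X

def lasso_ista(X, y, lam, n_iter=500):
    """Minimize 0.5*||y - Xb||_2^2 + lam*||b||_1 by proximal gradient steps."""
    step = 1.0 / np.linalg.norm(X, 2) ** 2      # 1 / Lipschitz constant of the gradient
    b = np.zeros(X.shape[1])
    for _ in range(n_iter):
        z = b - step * (X.T @ (X @ b - y))      # gradient step on the quadratic term
        b = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft threshold
    return b

def detect_active_users(y, codebook, tau, lam, tol=1e-6):
    """Return indices of users whose length-(tau+1) sub-vector is nonzero."""
    X = expanded_codebook(codebook, tau)
    b = lasso_ista(X, y, lam).reshape(codebook.shape[1], tau + 1)
    return np.flatnonzero(np.abs(b).max(axis=1) > tol)
```

Note that the detector only asks whether any entry of a user's sub-vector survives the thresholding; it never needs to decide which delay produced it, which mirrors the group-style decision rule in the last step of Algorithm 1.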



We next state the main results of this section, which bound the probability of error of Algorithm 1.
Here we present MUD guarantees for arbitrary codebooks, parametrized by two metrics of the expanded
codebook $\mathbf{X}$. The first is the worst-case coherence of the expanded codebook, defined by

$$\mu(\mathbf{X}) \overset{\mathrm{def}}{=} \max_{(i,j) \neq (i',j')} |\langle \mathbf{x}_{i,j}, \mathbf{x}_{i',j'} \rangle|, \tag{9}$$

where $\mathbf{x}_{i,j}$ denotes the j-th column of the Toeplitz matrix $\mathbf{X}_i$. In words, the worst-case coherence is the
largest inner product between any two codewords with arbitrary shifts. The second metric is the spectral
norm of the expanded codebook: $\|\mathbf{X}\|_2 \overset{\mathrm{def}}{=} \sqrt{\lambda_{\max}(\mathbf{X}^T \mathbf{X})}$.
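Both metrics are straightforward to evaluate numerically for a given matrix. The following sketch assumes the matrix has unit-norm columns, as the expanded codebook does, and computes the worst-case coherence from the off-diagonal entries of the Gram matrix and the spectral norm as the largest singular value. The function names are hypothetical.

```python
import numpy as np

def worst_case_coherence(X):
    """Largest absolute inner product between two distinct columns of X."""
    gram = np.abs(X.T @ X)
    np.fill_diagonal(gram, 0.0)   # ignore each column's inner product with itself
    return gram.max()

def spectral_norm(X):
    """Largest singular value of X, i.e. sqrt(lambda_max(X^T X))."""
    return np.linalg.norm(X, 2)
```

Forming the full Gram matrix costs memory quadratic in the number of columns, so for large expanded codebooks one would compute it block by block; the dense version above is only meant for small experiments.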

Theorem 1: Suppose that users in the network become active according to independent and identically
distributed (iid) Bernoulli random variables such that $\Pr(\delta_i = 1) = k/M$, and the users have transmit
powers satisfying

$$\mathcal{E}_i > \frac{128 \log(M\sqrt{\tau+1})}{|h_i|^2}, \quad i \in \mathcal{I}. \tag{10}$$

Then, with $\lambda = 2\sqrt{2\log(M\sqrt{\tau+1})}$, Algorithm 1 successfully carries out multiuser detection with
$P_{\mathrm{err}} \leq 2M^{-1}(2\pi \log(M\sqrt{\tau+1}))^{-1/2} + 5(M(\tau+1))^{-1} + 3M^{-2\log 2}$ as long as

$$M \leq \frac{\exp\big((c(\tau+1)\mu(\mathbf{X}))^{-1}\big)}{\tau+1} \tag{11}$$

and

$$k \leq \frac{M}{c \log(M(\tau+1)) \|\mathbf{X}\|_2^2}. \tag{12}$$

Here, the constant $c > 0$ is independent of the problem parameters.

^5 Algorithm 1 acts as a hybrid between the standard lasso and the group lasso [9]. Specifically, it is clear from the problem
formulation that the group lasso is ill-suited for the specified MUD problem since each of the sub-vectors $\{\tilde{\boldsymbol{\beta}}_i\}$ in (8) has at
most one non-zero entry. On the other hand, we are only interested in detecting the active users and need not estimate their
delays; hence, the group nature of the detection criterion in the definition of $\hat{\mathcal{I}}$.

Remark 1: Notice that Theorem 1 requires the transmit powers of all active users to satisfy (10). This
could lead to unrealistic demands on the transmit powers of users with very small fading coefficients.
There is, however, a straightforward extension of Theorem 1 that handles such situations by requiring
users with small-enough fading coefficients to remain inactive. Since the required analysis in that case
can be carried out by using well-known techniques for computing outage probabilities, we have chosen
to forgo a detailed discussion of this issue for brevity of exposition.

The proof of this theorem is provided in Appendix A. From (11) we see that to accommodate a large
number of total users M, we need codewords that result in an expanded codebook with a small worst-case
coherence $\mu(\mathbf{X})$. Similarly, (12) shows that codewords that result in small spectral norm $\|\mathbf{X}\|_2$ allow
the value of k to be large. While Theorem 1 is general, it may be a bit opaque to some readers. Once
applied to specific codewords in Theorems 3 and 4, however, favorable scaling relations between k, M
and N become apparent.
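To get a feel for the magnitudes involved, the short sketch below evaluates the regularization parameter and the minimum transmit power prescribed by Theorem 1. It assumes the readings $\lambda = 2\sqrt{2\log(M\sqrt{\tau+1})}$ and $\mathcal{E}_i > 128\log(M\sqrt{\tau+1})/|h_i|^2$ of the theorem with unit noise variance; the function names are hypothetical.

```python
import math

def lasso_regularizer(M, tau):
    """lambda = 2 * sqrt(2 * log(M * sqrt(tau + 1)))."""
    return 2.0 * math.sqrt(2.0 * math.log(M * math.sqrt(tau + 1)))

def min_transmit_power(M, tau, h):
    """Lower bound on E_i for a user with fading coefficient h (unit noise variance)."""
    return 128.0 * math.log(M * math.sqrt(tau + 1)) / abs(h) ** 2
```

Under these readings the two quantities are tied together: the power condition is the same as requiring the received amplitude $|h_i|\sqrt{\mathcal{E}_i}$ to exceed $4\lambda$, since $16\lambda^2 = 128\log(M\sqrt{\tau+1})$.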

Note that Theorem 1 considers recovery in the presence of an arbitrary set of delays $\{\tau_i\}$. Specifically,
the result describes average-case behavior for user activity and worst-case behavior for the set of users'
delays. This is desirable since fixing a probability distribution on the delays restricts the applicability of
our results to only certain classes of networks. Nonetheless, considering random delays can be desirable
in certain cases. To this effect, we now place a probability model on the set of delays and derive a result
analogous to that of Theorem 1. Explicitly, for each i with $\delta_i = 1$, we consider $\tau'_i$ selected uniformly at
random from the set $\{0, 1, \dots, \tau\}$. In doing so, we find that the requirement on k scales more favorably
with respect to the maximum delay $\tau$ when one considers this typical-case analysis of $\{\tau'_i\}$. Note that
while any probability model on the delays reduces applicability of the corresponding results to certain
network settings, the uniform distribution of delays is mainly an illustrative model that is also amenable
to analysis.

Theorem 2: Suppose that users in the network become active according to iid Bernoulli random
variables such that $\Pr(\delta_i = 1) = k/M$. Further, suppose that the delays of active users $\{\tau'_i : i \in \mathcal{I}\}$
are drawn uniformly at random from $\{0, 1, \dots, \tau\}$ and the transmit powers of users satisfy (10). Then,
with $\lambda = 2\sqrt{2\log(M\sqrt{\tau+1})}$, Algorithm 1 successfully carries out multiuser detection with
$P_{\mathrm{err}} \leq 2M^{-1}(2\pi \log(M\sqrt{\tau+1}))^{-1/2} + 7(M(\tau+1))^{-2\log 2}$ as long as

$$M \leq \frac{\exp\big((c'(\tau+1)\mu(\mathbf{X}))^{-1}\big)}{\tau+1} \tag{13}$$

and

$$k \leq \frac{M(\tau+1)}{c' \log(M(\tau+1)) \|\mathbf{X}\|_2^2}. \tag{14}$$

Here, the constant $c' > 0$ is independent of the problem parameters.

The proof of this theorem is provided in Appendix B. Compared with Theorem 1, we note that the
bound on the number of active users in this case scales with an additional factor of $\tau + 1$. When we
specialize this theorem to random codewords in Section IV-A and a deterministic codeword construction
in Section IV-B, this translates to a mere log-order dependence on $\tau$.



B. Computational Complexity 

Theorem 1 characterizes the performance of Algorithm 1 for MUD in asynchronous on-off RACs but
fails to shed any light on its computational complexity. However, the lasso is a well-studied program
in the statistics literature and, thanks to its convex nature, there exist a number of extremely fast
(polynomial-time) implementations of the optimization program specified in (LASSO); see, e.g., [10].

In this regard, computational complexity of the implementations of (LASSO) such as SpaRSA [10] is
determined, to a large extent, by the complexity of the matrix-vector multiplications $\mathbf{X}\mathbf{b}$ and $\mathbf{X}^T\mathbf{y}$. It
therefore seems that Algorithm 1 increases the computational complexity of the matrix-vector multiplications
from $O(NM)$, corresponding to the case of perfectly-known user delays [cf. (6)], to $O(NM(\tau+1))$.
This observation, however, ignores the fact that $\mathbf{X}$ in (8) has a Toeplitz-block structure. Specifically, if
we write $\mathbf{b} \in \mathbb{R}^{M(\tau+1)}$ as $\mathbf{b} = [\mathbf{b}_1^T \ \cdots \ \mathbf{b}_M^T]^T$ then it follows from elementary signal processing that

$$\mathbf{X}\mathbf{b} = \sum_{i=1}^{M} \mathcal{F}_{N+\tau}^{-1}\big(\mathcal{F}_{N+\tau}(\mathbf{x}_i) \odot \mathcal{F}_{N+\tau}(\mathbf{b}_i)\big), \tag{15}$$

where $\mathcal{F}_n(\cdot)$ and $\mathcal{F}_n^{-1}(\cdot)$ denote the FFT implementation of the n-point discrete Fourier transform (DFT)
and the n-point inverse DFT of a sequence, respectively, while $\odot$ denotes pointwise multiplication.
Similarly, if we use $(\cdot)[n_1 : n_2]$ to denote the $n_1$-th to $n_2$-th elements of a vector and $(\cdot)^R$ to denote the
time-reversed version of a vector, then it follows from routine calculations that $\forall\, i = 1, \dots, M$, we have

$$(\mathbf{X}^T\mathbf{y})[i(\tau+1) - \tau : i(\tau+1)] = \big[\mathcal{F}_{2N+\tau-1}^{-1}\big(\mathcal{F}_{2N+\tau-1}(\mathbf{x}_i^R) \odot \mathcal{F}_{2N+\tau-1}(\mathbf{y})\big)\big][N : N + \tau].$$

TABLE I
Average recovery times in Matlab for N = 1023, M = 3072, and k = 50

Maximum delay $\tau$       50       100      150      200      250
Standard SpaRSA          21.2s    61.6s    96.2s    142.6s   173.0s
FFT Augmented SpaRSA     54.4s    53.4s    84.3s    98.0s    78.2s

It therefore follows from the complexity of the FFT that the matrix-vector multiplications $\mathbf{X}\mathbf{b}$ and $\mathbf{X}^T\mathbf{y}$
in Algorithm 1 can in fact be carried out using only $O(NM\log(N + \tau))$ operations as opposed to
$O(NM(\tau + 1))$ operations. This suggests that the computational complexity of Algorithm 1 at worst
differs by a factor of $\log(N + \tau)$ from an oracle-based scheme that has perfect knowledge of the matrix in (6).
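The FFT identities above can be checked numerically. The sketch below is a minimal NumPy version, not the Matlab code timed in Table I: it computes the forward product as a sum of linear convolutions using FFTs of length N plus the maximum delay, and the adjoint product as a set of correlations using FFTs of length 2N plus the maximum delay minus one. The function names are hypothetical, and `dense_expanded_codebook` is included only as a reference for comparison. Because the inverse transform is linear, the forward routine sums the spectra over users before a single inverse FFT, which is equivalent to summing the M inverse transforms in (15).

```python
import numpy as np

def dense_expanded_codebook(codebook, tau):
    """Explicit (N+tau) x M(tau+1) expanded codebook, used only as a reference."""
    N, M = codebook.shape
    X = np.zeros((N + tau, M * (tau + 1)))
    for i in range(M):
        for j in range(tau + 1):
            X[j:j + N, i * (tau + 1) + j] = codebook[:, i]  # codeword i delayed by j chips
    return X

def forward_fft(codebook, b, tau):
    """Compute X @ b as a sum of M linear convolutions x_i * b_i."""
    N, M = codebook.shape
    L = N + tau                                   # length of each linear convolution
    B = b.reshape(M, tau + 1)                     # row i holds the sub-vector b_i
    spec = np.fft.rfft(codebook.T, n=L, axis=1) * np.fft.rfft(B, n=L, axis=1)
    return np.fft.irfft(spec.sum(axis=0), n=L)    # one inverse FFT by linearity

def adjoint_fft(codebook, y, tau):
    """Compute X.T @ y by correlating y with every codeword."""
    N, M = codebook.shape
    L = 2 * N + tau - 1                           # length of reversed-codeword * y
    spec = np.fft.rfft(codebook[::-1].T, n=L, axis=1) * np.fft.rfft(y, n=L)
    corr = np.fft.irfft(spec, n=L, axis=1)
    # Entries N..N+tau (1-indexed) hold the tau+1 lags for each user.
    return corr[:, N - 1:N + tau].ravel()
```

Either routine can be handed to an iterative lasso solver in place of the explicit matrix products, so the expanded codebook never has to be stored.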

This conclusion is also justified numerically from the results of several numerical experiments reported
in Section V. Table I shows typical computation times of Algorithm 1 in Matlab for various values
of $\tau$. The standard SpaRSA recovery is faster at low values of $\tau$ due to Matlab's optimized matrix
multiplications. However, for $\tau \geq 100$, the advantage of the FFT-based implementation becomes apparent.
The non-monotonicity of recovery times in the FFT augmented numerical experiments is due to the 
complex interaction between padding in Matlab's FFT implementation, numerical accuracy, and additional 
SpaRSA iterations, a detailed discussion of which is beyond the scope of this paper. Of course, for practical 
applications, optimizations are required beyond a Matlab implementation. 

IV. Codewords for Multiuser Detection 

In this section, we consider two sets of user codewords for asynchronous CDRA using Algorithm 1. The
first is a random construction with normalized iid ±1 user codewords and the second is a deterministic
construction based on cyclic codes. The deterministic construction has the advantage that user codewords
can be more efficiently stored and generated. However, the random construction is more flexible with
regard to codeword length and number of codewords available. Using Theorems 1 and 2, we show that
(ignoring $\tau$) both sets of codewords allow the recovery of $\mathcal{I}$ when $k \lesssim N/\log M$. Furthermore, the
random codewords allow M to be super-polynomial in N while the deterministic codewords allow M
to be polynomial in N.

A. Random Rademacher Codewords: Guarantees 

Communication theory often uses random codewords for optimality. Furthermore, random measurement 
matrices are frequently used in sparse signal recovery. These examples inspire us to analyze randomly 
generated codewords in the context of Theorems 1 and 2 for MUD in RACs.
In the following, we assign each user a codeword of length N that is independently generated from
a binary$(\pm 1/\sqrt{N}, \mathbf{I}_N)$ distribution. We seek to quantify $\mu(\mathbf{X})$ and $\|\mathbf{X}\|_2$ of the expanded codebook to
specialize Theorems 1 and 2 to these random codewords. This is accomplished in the following lemmas.

Lemma 1: Given any fixed $q > 0$, the expanded codebook $\mathbf{X}$ of random Rademacher codewords
satisfies $\mu(\mathbf{X}) \leq q$ with probability exceeding $1 - 2M^2(\tau+1)^2 e^{-Nq^2/4}$.

Proof: The proof of this lemma is a consequence of the bound on the worst-case coherence $\mu$ of
random Toeplitz matrices [11, Theorem 3.5] and the Hoeffding inequality [?]. Specifically, we can write

$$\mu(\mathbf{X}) = \max\Big\{ \max_{i,\, j \neq j'} |\langle \mathbf{x}_{i,j}, \mathbf{x}_{i,j'} \rangle|,\ \max_{i \neq i',\, j,\, j'} |\langle \mathbf{x}_{i,j}, \mathbf{x}_{i',j'} \rangle| \Big\}.$$

Furthermore, the proof of Theorem 3.5 in [11] implies that $|\langle \mathbf{x}_{i,j}, \mathbf{x}_{i,j'} \rangle| \leq q$ with probability exceeding
$1 - 4e^{-Nq^2/4}$ for any $j \neq j'$. Finally, since the product of two independent binary random variables is again
a binary random variable, it can also be shown using the Hoeffding inequality that $|\langle \mathbf{x}_{i,j}, \mathbf{x}_{i',j'} \rangle| \leq q$
with probability exceeding $1 - 2e^{-Nq^2/2}$ for any $i \neq i'$. It therefore follows from the union bound that
$\mu(\mathbf{X}) \leq q$ with probability exceeding $1 - 2M^2(\tau+1)^2 e^{-Nq^2/4}$. ■
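Lemma 2: The spectral norm of the expanded codebook $\mathbf{X}$ of random Rademacher codewords satisfies

$$\|\mathbf{X}\|_2 \leq 26\sqrt{\tau+1}\left(1 + \sqrt{\frac{M}{N}}\right) \tag{16}$$

with probability exceeding $1 - e^{-N/8}$.

Proof: We first recall that the spectral norm is invariant under column-interchange operations. Now
define $\boldsymbol{\Phi} \overset{\mathrm{def}}{=} [\mathbf{x}_1 \ \cdots \ \mathbf{x}_M]$ and $\boldsymbol{\Psi} \overset{\mathrm{def}}{=} [\boldsymbol{\Psi}_0 \ \boldsymbol{\Psi}_1 \ \cdots \ \boldsymbol{\Psi}_\tau]$, where each block $\boldsymbol{\Psi}_i$ is an $(N + \tau) \times M$
matrix that is constructed by prepending and appending $\boldsymbol{\Phi}$ with $i$ rows and $(\tau - i)$ rows of all zeros,
respectively. It is then easy to see that $\|\mathbf{X}\|_2 = \|\boldsymbol{\Psi}\|_2$ and $\|\boldsymbol{\Psi}_0\|_2 = \cdots = \|\boldsymbol{\Psi}_\tau\|_2 = \|\boldsymbol{\Phi}\|_2$. Furthermore,
we can write for any $M(\tau+1)$-dimensional vector $\mathbf{z} = [\mathbf{z}_0^T \ \mathbf{z}_1^T \ \cdots \ \mathbf{z}_\tau^T]^T$

$$\frac{\|\boldsymbol{\Psi}\mathbf{z}\|_2}{\|\mathbf{z}\|_2} \overset{(a)}{\leq} \frac{\sum_{i=0}^{\tau} \|\boldsymbol{\Psi}_i \mathbf{z}_i\|_2}{\|\mathbf{z}\|_2} \leq \frac{\|\boldsymbol{\Phi}\|_2 \sum_{i=0}^{\tau} \|\mathbf{z}_i\|_2}{\|\mathbf{z}\|_2} \overset{(b)}{\leq} \frac{\sqrt{\tau+1}\,\|\boldsymbol{\Phi}\|_2 \|\mathbf{z}\|_2}{\|\mathbf{z}\|_2} = \sqrt{\tau+1}\,\|\boldsymbol{\Phi}\|_2, \tag{17}$$

where (a) follows from the definition of $\boldsymbol{\Psi}$ and the triangle inequality, while (b) follows from the
Cauchy-Schwarz inequality. It therefore follows from the previous discussion and (17) that $\|\mathbf{X}\|_2 \leq \sqrt{\tau+1}\,\|\boldsymbol{\Phi}\|_2$.

In order to complete the proof, notice that $\boldsymbol{\Phi}$ is an $N \times M$ random matrix whose entries are independently
distributed as binary$(\pm 1/\sqrt{N})$. It can therefore be established, similar to [11, Proposition 2.4],
that $\|\boldsymbol{\Phi}\|_2 \leq 26(1 + \sqrt{M/N})$ with probability exceeding $1 - e^{-N/8}$. ■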
We now want to apply these lemmas to specialize Theorems 1 and 2 to the expanded codebook matrix
of random codewords. We begin by noting that with $q$ of Lemma 1 chosen appropriately, the event

$$\mathcal{G}_1 \overset{\mathrm{def}}{=} \left\{ \mu(\mathbf{X}) \leq \sqrt{\frac{12 \log(M(\tau+1))}{N}} \right\} \tag{18}$$

holds with probability exceeding $1 - 2(M(\tau+1))^{-1}$. Furthermore, Lemma 2 implies that the event

$$\mathcal{G}_2 \overset{\mathrm{def}}{=} \left\{ \|\mathbf{X}\|_2 \leq 52\sqrt{\frac{M(\tau+1)}{N}} \right\} \tag{19}$$

holds with probability exceeding $1 - e^{-N/8}$.

Since we assume that the random codewords are assigned independently of the set of active users
$\mathcal{I}$, we can substitute (18) and (19) into (11) and (12) while adding the failure probabilities $\Pr(\mathcal{G}_1^c)$ and
$\Pr(\mathcal{G}_2^c)$ to the probabilities of error in Theorems 1 and 2 via the union bound. This results in the following
theorem.



Theorem 3: Suppose that the $M$ codewords $\{x_i \in \mathbb{R}^N\}$ are drawn independently from a binary($\pm 1/\sqrt{N}$, $1/2$) distribution. Furthermore, let $\lambda$ and $\{\mathcal{E}_i\}$ satisfy the conditions in Theorem 1 and let $M$ satisfy

$$M \le \frac{\exp\left(c_{1,r}(\tau+1)^{-2/3}N^{1/3}\right)}{\tau+1}. \tag{20}$$

(a) For an arbitrary set of user delays, if

$$k \le \frac{c_{2,r}N}{(\tau+1)\log(M(\tau+1))}, \tag{21}$$

then Algorithmjljsuccessfully carries out multiuser detection with Pgrr < 2M~^ (2tt log{My/T + 1 )) ^^^+ 
5(M(r + ^ 3^,^-2 log 2 + 2(M(t + + e^. 

(b) For a set of user delays distributed uniformly at random, when

$$k \le \frac{c_{3,r}N}{\log(M(\tau+1))}, \tag{22}$$

then Algorithmjljsuccessfully carries out multiuser detection with Pgrr < 2M^^ (27r log(M-v/T + 1 )) ^^^+ 
7(M(r + 1))-'^°^' + 2(M(r + + e^. 
Here, the constants $c_{1,r}, c_{2,r}, c_{3,r} > 0$ are independent of the problem parameters.

Remark 2: It is important to note here that, instead of relying upon Theorem [T] if one were to directly 
analyze the MUD performance of Algorithm [T] for random codewords then it is possible to achieve 
the scaling k ^ N/log^ (M(r + 1)) in the case of arbitrary delays by using the results of ||7l for the 
"invertibility condition" in Appendix ^ Specifically, the work in 171 considers random Toeplitz-block 
matrices in a similarly structured problem and achieves only a poly-logarithmic dependence on $\tau$ in the case of arbitrary delays. The analysis in [7], however, does not extend to arbitrary Toeplitz-block matrices, and the proof techniques used there introduce some complications related to noise folding. In contrast, our focus here is to provide conditions applicable to arbitrary codewords via the metrics $\|X\|_2$ and $\mu(X)$, and our results for random matrices/codewords are primarily meant to be a demonstration of our more general results. Nonetheless, we believe that [7] provides unique insights for Algorithm 1 in the case of random codewords and arbitrary delays.

B. Deterministic Codewords: Construction and Guarantees 

Though random codewords allow our proposed scheme to service a large number of users (with respect to $N$), deterministic codewords can have significant advantages. In particular, they tend to be much easier to generate and store. We will consider one such codeword construction in this section.

Our deterministic construction uses codewords derived from algebraic error correcting codes. In par- 
ticular, we consider a cyclic code for which the codebook is closed under circular shifts of codewords. 
We use a cyclic code since we use cyclic shifts as approximations of delayed user codewords. As such, 
the full cyclic code is closely related to the expanded codebook matrix X which contains the delayed 
user codewords. To construct this relationship, we will select a subset of our cyclic code for assignment 
to users. In order to remove ambiguity when discussing both the full cyclic code and this subset, we will 
call the complete cyclic code the ambient code, while the subset assigned to users will be called the user 
codebook (i.e., the user codebook is the set {xj}). 

Our construction is parametrized by two positive integers, $m$ and $2 < t < m/2$. We will operate in the Galois finite field of size $2^m$, which we denote as GF($2^m$). Our code is constructed via the trace function $\mathrm{Tr} : \mathrm{GF}(2^m) \to \mathrm{GF}(2)$ ([14, Ch. 4.8]) defined by

$$\mathrm{Tr}(a) = a + a^2 + \cdots + a^{2^{m-1}} = \sum_{i=0}^{m-1} a^{2^i}.$$

Taking $z$ as a primitive element of GF($2^m$), we define the $i$-th element of a codeword in the ambient code as

$$C_a^i = \frac{1}{\sqrt{2^m-1}}\,(-1)^{\mathrm{Tr}\left[a_0 z^i + \sum_{j=1}^{t} a_j z^{i(2^j+1)}\right]}, \quad i = 0, 1, 2, \dots, 2^m-2, \tag{23}$$

where the vector $a = [a_0 \; a_1 \; \cdots \; a_t]$ with $t+1$ elements in GF($2^m$) indexes the codeword.

Since $z$ is primitive, $\{z^j : j = 0, 1, \dots, 2^m-2\}$ is simply the set of all non-zero elements in GF($2^m$), which we denote GF($2^m$)$^*$. Thus, we can equivalently enumerate the elements by $x \in$ GF($2^m$)$^*$ as

$$C_a(x) = \frac{1}{\sqrt{2^m-1}}\,(-1)^{\mathrm{Tr}\left[a_0 x + \sum_{j=1}^{t} a_j x^{2^j+1}\right]}, \quad x \in \mathrm{GF}(2^m)^*. \tag{24}$$

The above construction produces codewords of length $2^m-1$ and, since each of $a_i$, $i = 0, \dots, t$, can be any value in GF($2^m$), there are $2^{m(t+1)}$ codewords in the ambient code.
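To make the construction concrete, the sketch below generates the sign pattern of (23) for a small field. It is an illustration under stated assumptions rather than the authors' implementation: it fixes $m = 5$ with the primitive polynomial $x^5 + x^2 + 1$, represents field elements as integers whose bits are polynomial coefficients (so `2` plays the role of $z$), and returns unnormalized $\pm 1$ entries that would be scaled by $1/\sqrt{2^m-1}$. All function names are hypothetical.

```python
M_BITS = 5               # m: arithmetic is in GF(2^5)
POLY = 0b100101          # x^5 + x^2 + 1, a primitive polynomial over GF(2)
N = 2**M_BITS - 1        # codeword length 2^m - 1

def gf_mul(a, b):
    """Multiply two field elements (carry-less product reduced modulo POLY)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> M_BITS:
            a ^= POLY
    return r

def gf_pow(a, e):
    """Raise a field element to a non-negative integer power."""
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r

def trace(a):
    """Tr(a) = a + a^2 + ... + a^(2^(m-1)); the result is 0 or 1."""
    s, p = 0, a
    for _ in range(M_BITS):
        s ^= p
        p = gf_mul(p, p)
    return s

def codeword_signs(a):
    """Signs of the codeword indexed by a = [a0, ..., at], following (23)."""
    signs = []
    for i in range(N):
        x = gf_pow(2, i)                               # x = z^i
        arg = gf_mul(a[0], x)
        for j in range(1, len(a)):
            arg ^= gf_mul(a[j], gf_pow(x, 2**j + 1))   # a_j * x^(2^j + 1)
        signs.append(-1 if trace(arg) else 1)
    return signs

a, b = [3, 7, 12], [5, 9, 30]                          # arbitrary index vectors, t = 2
ca, cb = codeword_signs(a), codeword_signs(b)
cab = codeword_signs([u ^ v for u, v in zip(a, b)])    # index vector a + a'
```

Because field addition is bitwise XOR in this representation, the element-wise product of `ca` and `cb` should coincide with `cab`, which is the linearity property used later when bounding the coherence.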

We use a subset of the ambient code as the user codebook. We will require two conditions on the selected subset. The first condition restricts us to a subset where no codeword in the subset is a cyclic shift of another. Such a restriction is necessary since, in bounding $\mu(X)$, we will link cyclic shifts with different user delays. We will call such a set a cyclic restricted subset. There are many ways to create a cyclic restricted subset. Consider that, under the element enumeration of (23), a codeword and its cyclic shift by $T$ elements are related as $C_a^{i+T} = C_{a'}^{i}$, where $a = [a_0, \dots, a_t]$ and $a' = [z^{T}a_0, z^{(2^1+1)T}a_1, \dots, z^{(2^t+1)T}a_t]$. If, for example, we required that all codewords in our subset had $a_0 = c$ for $c \in$ GF($2^m$)$^*$, no codewords would be the shift of another. Explicitly, the codewords indexed by $\{a \in \mathrm{GF}(2^m)^{t+1} : a_0 = c\}$ form a cyclic restricted subset for any $c \neq 0 \in$ GF($2^m$). Since we need only restrict a single entry in the vector $a$, enumerating over the remaining entries allows us to have a cyclic restricted subset of size $2^{mt}$ from the full $2^{m(t+1)}$ ambient code. We may choose to use a smaller set, affording flexibility in choosing the value of $M$, and the set would remain a cyclic restricted subset.
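The relation between a cyclic shift and a re-indexed codeword can be verified directly on a toy field. This is a hedged sketch: the field GF($2^5$) with polynomial $x^5 + x^2 + 1$, the integer representation of elements (with `2` standing for $z$), the shift `T = 4`, and the helper names are all assumptions chosen for illustration.

```python
M_BITS, POLY = 5, 0b100101      # GF(2^5) with x^5 + x^2 + 1 (assumed toy field)
N = 2**M_BITS - 1

def gf_mul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> M_BITS:
            a ^= POLY
    return r

def gf_pow(a, e):
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r

def trace(a):
    s, p = 0, a
    for _ in range(M_BITS):
        s ^= p
        p = gf_mul(p, p)
    return s

def codeword_signs(a):
    """Unnormalized +-1 entries of the codeword indexed by a = [a0, ..., at]."""
    out = []
    for i in range(N):
        x = gf_pow(2, i)
        arg = gf_mul(a[0], x)
        for j in range(1, len(a)):
            arg ^= gf_mul(a[j], gf_pow(x, 2**j + 1))
        out.append(-1 if trace(arg) else 1)
    return out

def shifted_index(a, T):
    """Index vector a' whose codeword equals the codeword of a cyclically shifted by T."""
    zT = gf_pow(2, T)
    return [gf_mul(a[0], zT)] + [gf_mul(a[j], gf_pow(zT, 2**j + 1)) for j in range(1, len(a))]

a, T = [3, 7, 12], 4
c = codeword_signs(a)
c_shift = codeword_signs(shifted_index(a, T))
# Every non-trivial shift changes a0, so fixing a0 = c != 0 excludes all shifts.
a0_values = {shifted_index(a, s)[0] for s in range(1, N)}
```

Here `c_shift[i]` should equal `c[(i + T) % N]` for every `i`, and `a0_values` should not contain the original `a[0]`, which is why restricting $a_0$ to a fixed non-zero value yields a cyclic restricted subset.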

The second condition on selecting the user codebook as a subset of the ambient code is used to ensure we can appropriately bound $\|X\|_2$. As above, our condition will be on the set of vectors $a$ which enumerate the codewords in the user codebook. To describe the condition we first define a wildcard index of the user codebook. We call $w$ a wildcard index of the user codebook if, for each vector $a = [a_0, \dots, a_w, \dots, a_t]$ that indexes a user codeword, the vector $[a_0, \dots, c, \dots, a_t]$ (i.e., the vector $a$ with $a_w$ replaced by $c$) also indexes a user codeword for each $c \in$ GF($2^m$). We require that the user codebook have a wildcard index $w$ such that $2^w+1$ does not divide $2^m-1$ (denoted $2^w+1 \nmid 2^m-1$). A consequence of requiring the existence of a wildcard index is that the user codebook must be a multiple of $2^m$ in size.

To summarize our construction, we assign to users codewords of the form (23). We require that the user codebook satisfy two conditions. The first is that it forms a cyclic restricted subset of the ambient code of all possible codewords. The second is that the user codebook contain a wildcard index $w$ with $2^w+1 \nmid 2^m-1$. This construction allows us to have $N = 2^m-1$ while $M$ may be a multiple of $2^m$ up to $2^{mt}$.

We now seek to apply Theorems 1 and 2 to this construction, which requires us to bound the two metrics $\mu(X)$ and $\|X\|_2$. We consider each of the two metrics in turn. The metric $\mu(X)$ bounds inner products between any two shifted codewords in the user codebook. As discussed earlier, we will be relating the set of shifted codewords to the ambient code. As a result, our first goal is to bound the inner product of any two codewords in the ambient code. The bound can be obtained easily by exploiting properties of the ambient code. By the linearity of the trace [14, p. 116], we find that the element-wise product of two codewords satisfies, up to normalization, $C_a^i C_{a'}^i = C_{a+a'}^i$.* As a result, the inner product between two non-identical codewords is simply the sum of the entries of a different codeword. This leads to an equivalent goal of bounding the sum of an arbitrary non-trivial codeword. That is, allowing $\{a_j \in \mathrm{GF}(2^m)\}_{j=0}^{t}$ to be arbitrary with $a \neq 0$, we attempt to bound the sum

$$S = \sum_{x \in \mathrm{GF}(2^m)^*} (-1)^{\mathrm{Tr}\left[a_0 x + \sum_{j=1}^{t} a_j x^{2^j+1}\right]}. \tag{25}$$

Lemma 3: The sum given in (25) satisfies $|S| \le 2^{t+1/2+m/2}$ for any codeword.

This lemma is proved in Appendix C. We leverage this result to provide a bound on $\mu(X)$.
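For very small parameters the bound of Lemma 3 can be checked by exhaustive enumeration. The sketch below does this for $m = 5$ and $t = 1$; these values, the primitive polynomial $x^5 + x^2 + 1$, and the helper names are assumptions chosen only to keep the search tiny, and the check says nothing about larger parameters.

```python
M_BITS, T_PARAM = 5, 1          # toy values of m and t (assumed for illustration)
POLY = 0b100101                 # x^5 + x^2 + 1

def gf_mul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> M_BITS:
            a ^= POLY
    return r

def gf_pow(a, e):
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r

def trace(a):
    s, p = 0, a
    for _ in range(M_BITS):
        s ^= p
        p = gf_mul(p, p)
    return s

xs = [gf_pow(2, i) for i in range(2**M_BITS - 1)]   # all non-zero field elements
x3 = [gf_pow(x, 3) for x in xs]                      # x^(2^1 + 1)

def codeword_sum(a0, a1):
    """The sum S of (25) for the index vector [a0, a1] (unnormalized signs)."""
    return sum(-1 if trace(gf_mul(a0, x) ^ gf_mul(a1, y)) else 1
               for x, y in zip(xs, x3))

# Largest |S| over every non-trivial index vector.
worst = max(abs(codeword_sum(a0, a1))
            for a0 in range(2**M_BITS) for a1 in range(2**M_BITS)
            if (a0, a1) != (0, 0))
bound = 2 ** (T_PARAM + 0.5 + M_BITS / 2)
print(worst, bound)
```

The trivial index vector is excluded because its sum is simply the codeword length $2^m - 1$.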

Lemma 4: Let the user codebook $\{x_i\}$ be a cyclic restricted subset of the ambient code defined by (23). Then the worst-case coherence is bounded by

$$\mu(X) \le \frac{2^{t+1/2+m/2} + \tau}{2^m-1}.$$

Proof: We are interested in bounding $|\langle x_{i,j}, x_{i',j'}\rangle|$. We will do so by relating this inner product to one in the ambient code. Since $x_{i',j'}$ consists of the codeword $x_{i'}$ with vectors of zeros above and below it, we can replace these zeros with shifted copies of $x_{i'}$ in order to make the periodic vector $\tilde{x}_{i',j'}$ with period $N$ on its $N + \tau$ length; we call $\tilde{x}_{i',j'}$ the periodic extension of $x_{i',j'}$. Let $\Delta = \tilde{x}_{i',j'} - x_{i',j'}$ be its difference from the original. By the triangle inequality,

$$|\langle x_{i,j}, x_{i',j'}\rangle| \le |\langle x_{i,j}, \tilde{x}_{i',j'}\rangle| + |\langle x_{i,j}, \Delta\rangle|.$$

The periodic extension has converted the shift $j'$ into a cyclic shift on the support of $x_{i,j}$. Furthermore, since the two unshifted codewords $x_i$ and $x_{i'}$ come from a cyclic restricted subset, they are guaranteed to be different on the support. Thus, for the first term we use the bound of Lemma 3 divided by $2^m-1$, thereby accounting for the normalization of the users' codewords. The second term is bounded by the fact that the support of $\Delta$ overlaps with that of $x_{i,j}$ in at most $\tau$ elements of value $\pm 1/\sqrt{2^m-1}$. ■
Having bounded $\mu(X)$, we now turn to the second metric $\|X\|_2$ and bound it using the following lemma.

Lemma 5: Let the user codebook of cardinality $M$ selected from the ambient code have a wildcard index $w$ such that $2^w+1 \nmid 2^m-1$. Then the spectral norm of the expanded codebook $X$ is bounded by

$$\|X\|_2 \le \sqrt{\frac{M}{2^m-1}(\tau+1)}.$$

*This is a simple reformulation of the fact that the non-exponentiated version of the code in (23) is linear.
Proof: We begin the proof similarly to Lemma 2. Let $\Phi$ be an $N \times M$ matrix of the user codewords and recall from Lemma 2 that $\|X\|_2 \le \sqrt{\tau+1}\,\|\Phi\|_2$. We will now show that the rows of $\Phi$ are orthogonal, such that $\Phi\Phi^T = \frac{M}{2^m-1} I$, which is sufficient for the proof since $\lambda_{\max}(\Phi^T\Phi) = \lambda_{\max}(\Phi\Phi^T)$.

Using (24), the inner product between two rows, indexed by $x$ and $y$ in GF($2^m$)$^*$, is the following sum over the vectors $a$ indexing the user codebook:

$$\sum_{a} C_a(x)C_a(y) = \frac{1}{2^m-1}\sum_{a} (-1)^{\mathrm{Tr}\left[a_0(x+y) + \sum_{j=1}^{t} a_j\left(x^{2^j+1} + y^{2^j+1}\right)\right]} \tag{26a}$$

$$= \frac{1}{2^m-1}\sum_{a_w \in \mathrm{GF}(2^m)} (-1)^{\mathrm{Tr}\left[a_w\left(x^{2^w+1} + y^{2^w+1}\right)\right]} \sum_{a \setminus a_w} (-1)^{\mathrm{Tr}\left[a_0(x+y) + \sum_{j \ge 1,\, j \neq w} a_j\left(x^{2^j+1} + y^{2^j+1}\right)\right]}, \tag{26b}$$

where in (26b) we have separated the wildcard index element $a_w$, which takes every value in GF($2^m$), into a separate sum. For all $a \in$ GF($2^m$) we have $a + a = 0$. Thus, from the sum in (26a), when $x = y$ each term takes unit value and their sum equals $M$, the number of $a$. In the case when $x \neq y$, we examine (26b). By our wildcard index condition we are guaranteed that $x^{2^w+1} \neq y^{2^w+1}$ so that $a_w$ has a non-zero coefficient in the trace. By Proposition 5(a) in Appendix C, precisely half the terms of the sum over $a_w$ are $-1$ and thus the whole sum evaluates to 0. Therefore, the inner product between the two rows evaluates to $M/(2^m-1)$ when $x = y$ and to 0 otherwise. This completes the proof of the lemma.

■
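The row orthogonality at the heart of this proof can be observed on a toy codebook. The sketch below is an illustration under assumptions, not the paper's setup: it takes $m = 5$ and $t = 1$, fixes $a_0 = 1$ (a cyclic restricted subset), and lets $a_1$ range over all of GF($2^5$) so that $w = 1$ is a wildcard index with $2^1 + 1 = 3$ not dividing $2^5 - 1 = 31$. Signs are left unnormalized, so the Gram matrix of the rows should be $M$ times the identity.

```python
M_BITS, POLY = 5, 0b100101      # GF(2^5) with x^5 + x^2 + 1 (assumed toy field)
N = 2**M_BITS - 1

def gf_mul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> M_BITS:
            a ^= POLY
    return r

def gf_pow(a, e):
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r

def trace(a):
    s, p = 0, a
    for _ in range(M_BITS):
        s ^= p
        p = gf_mul(p, p)
    return s

xs = [gf_pow(2, i) for i in range(N)]    # row index x runs over GF(2^m)^*
x3 = [gf_pow(x, 3) for x in xs]          # x^(2^w + 1) with w = 1

# User codebook: a = [1, a1] with a1 a wildcard over the whole field, so M = 2^m.
codebook = [[-1 if trace(x ^ gf_mul(a1, y)) else 1 for x, y in zip(xs, x3)]
            for a1 in range(2**M_BITS)]
M = len(codebook)

def row_inner(r, s):
    """Inner product between rows r and s of the (unnormalized) N x M codeword matrix."""
    return sum(cw[r] * cw[s] for cw in codebook)

gram_ok = all(row_inner(r, s) == (M if r == s else 0)
              for r in range(N) for s in range(N))
print(gram_ok)
```

After normalizing by $1/\sqrt{2^m-1}$, this corresponds to $\Phi\Phi^T = \frac{M}{2^m-1} I$ for this toy codebook.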

Having bounded both $\mu(X)$ and $\|X\|_2$, we are able to apply Theorems 1 and 2 to this deterministic construction and give the following recovery guarantees.

Theorem 4: Let $m$ and $t$ be positive integers and let $N = 2^m-1$. Suppose that the $M$ codewords $\{x_i\}$ are chosen from the code defined by (23) such that they form a cyclic restricted subset and have a wildcard index $w$ with $2^w+1 \nmid 2^m-1$. Furthermore, let $\lambda$ and $\{\mathcal{E}_i\}$ satisfy the conditions in Theorem 1 and let $M$ satisfy

(ci d ,1 \(-2t + l/2 + m/2 1 ^\ ) 

M < ^ (27) 

r + 1 

(a) For an arbitrary set of user delays, if

$$k \le \frac{c_{2,d}(2^m-1)}{(\tau+1)\log(M(\tau+1))}, \tag{28}$$

then Algorithm 1 successfully carries out multiuser detection with $P_{err} \le 2M^{-1}\left(2\pi\log(M\sqrt{\tau+1})\right)^{-1/2} + 5(M(\tau+1))^{-2\log 2} + 3M^{-2\log 2}$.
(b) For a set of user delays distributed uniformly at random, if

$$k \le \frac{c_{3,d}(2^m-1)}{\log(M(\tau+1))}, \tag{29}$$

then Algorithmjljsuccessfully carries out multiuser detection with Pgrr < 2M~^ [2tt log{M^/^^^\^ )) 

Here, the constants $c_{1,d}, c_{2,d}, c_{3,d} > 0$ are independent of the problem parameters.

It is important to note here that although (27) appears to allow a super-polynomial number of users $M$, our construction restricts us to at most $2^{mt}$ codewords to be assigned to users. For small values of



t, this restriction on M dominates the one in ([27]). However, as t approaches ^^V^, (p7) becomes the 



relevant bound on $M$. In general, comparing our deterministic construction of codewords to the randomly generated ones in Section IV-A, we find that the proposed deterministic codewords have advantages in storage and generation, while randomly generated codes have the advantages that $N$ is arbitrary and that $M$ can be super-polynomial in $N$.

V. Numerical Results and Discussion 

A. Monte Carlo Experiments 

To verify and illustrate the results presented in this paper for MUD in asynchronous RACs, we make use of Monte Carlo trials. Our numerical experiments assume a total of $M = 3072$ users communicating to the BS using codewords of length $N = 1023$. We report the MUD results for both the random codewords of Section IV-A and the deterministic construction of Section IV-B. For the deterministic construction, a code generated with $m = 10$ and $t = 2$ is used with a subset of the ambient code of size $M = 3(2^m)$ assigned to users. Random user activity is generated using independent 0-1 Bernoulli random variables $\{\delta_i\}$ such that $\Pr(\delta_i = 1) = k/M$ for a given $k$. Furthermore, for a given maximum delay $\tau$, the individual user delays $\{\tau_i\}$ are generated once at random for each experiment and then fixed
for the remainder of the experiment. The implementation of Algorithm [T] uses the SpaRSA package [IGJ 



in order to solve (LASSO) and includes the modifications described in Section III-B for speeding up the matrix-vector multiplications $Xb$ and $X^T y$. In all the numerical plots, results for random codewords are displayed using solid lines while those for deterministic codewords are displayed using dashed lines.
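The activity and delay model described above can be sketched in a few lines. This is an illustrative reconstruction, not the authors' simulation code: the function names are hypothetical, and drawing the delays uniformly from $\{0, \dots, \tau\}$ is an assumption, since the text only says they are generated "at random".

```python
import random

def draw_delays(M, tau, rng):
    """One delay per user, drawn once per experiment (uniform on 0..tau is an assumption)."""
    return [rng.randint(0, tau) for _ in range(M)]

def draw_active_set(M, k, rng):
    """Independent Bernoulli activity with Pr(user i is active) = k / M."""
    return [i for i in range(M) if rng.random() < k / M]

rng = random.Random(0)
M, k, tau = 3072, 50, 50            # M as in the experiments; k and tau are examples
delays = draw_delays(M, tau, rng)   # fixed for the remainder of the experiment
trials = [draw_active_set(M, k, rng) for _ in range(200)]
mean_active = sum(len(t) for t in trials) / len(trials)
print(mean_active)                  # fluctuates around k
```

Note that $k$ is only the expected number of active users under this model; the realized count $|\mathcal{I}|$ varies from trial to trial.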

The numerical experiments correspond to the ability of the MUD scheme proposed in Algorithm 1 to correctly recover the active user set $\mathcal{I}$ for varying values of the average number of active users $k$ and maximum delay $\tau$. The results of these experiments are reported in Figure 1, which shows that when $k$ is below a certain threshold, $\mathcal{I}$ is exactly recovered (i.e., $\hat{\mathcal{I}} = \mathcal{I}$) in the vast majority of Monte Carlo
Fig. 1. User support recovery error rate as a function of the expected number of active users k.




Fig. 2. Normalized per user error as a function of the expected number of active users k. 



trials. Beyond the threshold of $k \approx 50$, the fraction of Monte Carlo trials in error quickly approaches one. Figure 1 also shows that codewords generated as described in Section IV-B perform nearly identically to those randomly generated.

In order to compare our MUD results with some of the traditional SUD approaches, we have included 
numerical results corresponding to the performance of a matched filter receiver for the case of random 
Fig. 3. Normalized per user error as a function of the received user power with $\tau = 50$.



codewords. We assume that the SUD receiver has access to the outputs of the matched filters for all the $M(\tau+1)$ user codewords and shifts as well as an oracle knowledge of $|\mathcal{I}|$ (which is more specific than knowledge of $k = E|\mathcal{I}|$). Consequently, the receiver declares the users corresponding to the $|\mathcal{I}|$ largest
matched filter responses to be active. Note that, in general, any practical SUD receiver that detects using 
a fixed threshold for the matched filter responses is expected to perform worse than this oracle-like SUD. 
Despite this, we find that our proposed MUD algorithm significantly outperforms the traditional SUD 
receiver based on matched filtering ideas. 
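The oracle-aided SUD baseline described above amounts to a top-$|\mathcal{I}|$ selection over matched-filter responses. The sketch below is one plausible reading, with hypothetical names: summarizing each user by its largest response magnitude across the $\tau + 1$ shifts is an assumption, since the text does not spell out how responses at different shifts are combined.

```python
def oracle_sud(mf_outputs, num_active):
    """Declare active the num_active users with the largest matched-filter responses.

    mf_outputs[i][j] is the matched-filter output for user i at shift j.
    Taking the maximum magnitude over shifts per user is an illustrative assumption.
    """
    strength = [max(abs(v) for v in row) for row in mf_outputs]
    ranked = sorted(range(len(strength)), key=lambda i: strength[i], reverse=True)
    return set(ranked[:num_active])

# Toy example: 4 users, 3 shifts each; users 1 and 3 have the strongest responses.
outputs = [
    [0.1, -0.2, 0.05],
    [0.9, 0.1, -0.3],
    [0.2, 0.25, -0.1],
    [-0.1, -1.1, 0.0],
]
detected = oracle_sud(outputs, num_active=2)
```

A practical receiver would not know `num_active` and would have to rely on a fixed threshold instead, which is why the oracle version serves as an optimistic baseline.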

Note that the results in Figure 1 are reminiscent of Theorem 2 and the related sections of Theorems 3 and 4 rather than Theorem 1. This is because results for the worst-case algorithmic performance are difficult to verify experimentally. While guarantees for arbitrary user delays are desirable, verifying this numerically would require generating all $(\tau+1)^{|\mathcal{I}|}$ possible combinations of $\{\tau_i\}$ in each Monte Carlo trial. The worst-case analyses of (21) and (28) suggest that the threshold should be inversely proportional to $(\tau+1)$. On the other hand, Figure 1 does not exhibit this behavior since our numerical experiments correspond to a random generation of the delays $\{\tau_i\}$. Rather, they show that the recovery threshold of $k$ for a typical set of $\{\tau_i\}$ is not a strong function of $\tau$. This corresponds with the results for randomly distributed $\{\tau_i\}$ in (22) and (29).

The recovery metric in Figure 1 matches that of the theorems and declares a trial to be in error when $\hat{\mathcal{I}} \neq \mathcal{I}$. However, it is also useful in many cases to consider how far the estimate $\hat{\mathcal{I}}$ is from the correct set. Therefore, in Figure 2 we use the performance metric of average fraction of detection errors, which
corresponds to ^^"^^"^^fc^"^^"^^^ and describes the number of errors in the estimated set of active users as a 
fraction of the average number of active users. With this metric, we see that Algorithm [T] fails gracefully 
as k increases. 

We also use the recovery metric of average fraction of detection errors to describe the power requirements of the active users in Figure 3. This figure shows that the power requirement on $\mathcal{E}_i$ described in Theorems 1 and 2 is overly restrictive. Specifically, the rightmost point of the horizontal axis at $\mathcal{E}_i|h_i|^2 = 31$ dB provides the reference point as the power seen at the receiver as required by (10). The figure shows much less power is needed for recovering $\mathcal{I}$. It also shows that the required power is not a function of $k$, which is exactly in line with the results of our theory.



B. Discussion 

In order to place our results in context, we note that k < N/ log M scaling has also been suggested in 
13 for the case of MUD in synchronous on-off RACs using the lasso and random Gaussian codewords. 
Here, however, we provide non- asymptotic results for the more general asynchronous case, in contrast 
to the asymptotic results in fT\. Furthermore, we provide guarantees that can be applied to arbitrary user 
codewords. For the codewords studied in Theorems [3] and |4] we have established that the MUD scheme 
for asynchronous on-off RACs has the ability to achieve roughly the same (non-asymptotic) scaling of 
the system parameters k, N, and M as that reported in 121 for the ideal case of synchronous channels. 



With regard to the deterministic codewords introduced in Section IV-B, our construction is representative of a larger class of deterministic matrices derived from cyclic codes. We consider a particular cyclic code where the codewords are obtained by evaluating quadratic forms at elements of the field GF($2^m$). The worst-case coherence $\mu(X)$ of the expanded codebook matrix is determined by the minimum weight of the code and we bound this quantity by elementary methods in Appendix C. We note that Yu and Gong [15] have calculated the exact weight distribution of a very similar code using more sophisticated methods from symplectic geometry.

Beyond the application to MUD for RACs, our results can also be related to work on model selection. Most directly, our work builds on the model selection theory of Candes and Plan [6] for the lasso. As in this paper, [6] provides guarantees for lasso that are based on worst-case coherence $\mu(X)$ and spectral norm $\|X\|_2$. However, a key assumption in [6] requires that the vector $\beta$ be "generic" in the sense that its support is uniform over its $(\tau+1)M$ elements. In this paper, however, we assume a much different model: the support of $\beta$ is uniformly random over blocks of elements. In this light, the work here is related to recent work on block-sparse signals such as [16], which considers block-sparse signal recovery using a variant of orthogonal matching pursuit as opposed to the lasso. However, as work in the context of signal recovery rather than model selection, the work in [16] is not directly concerned with estimating $\mathcal{I}$ and cannot be applied to the MUD problem in RACs.

As a study of sparse signal recovery using a structured measurement matrix, this work relates to that of Romberg and Neelamani [7]. Though [7] considers a different application and is concerned with signal recovery rather than estimating $\mathcal{I}$, it studies Toeplitz-block matrices that are similar in structure to $X$. The approach in [7], however, differs from ours since it provides recovery guarantees based on the restricted isometry property (RIP) of the matrix. By working with the RIP, the analysis is particular to randomly generated Toeplitz columns. In contrast, here we provide guarantees for any matrix $X$ with sufficiently small $\mu(X)$ and $\|X\|_2$. Subsequently, we give both randomly generated and deterministic codeword designs satisfying the requirements. Furthermore, our work provides support set recovery guarantees, in the spirit of model selection [11], rather than bounds on recovered signal error guaranteed by RIP.

Finally, in terms of the application of our theory in the real world, we note that Theorems 1 through 4 provide non-asymptotic bounds on $k$ and $M$ that guarantee recovery of the set of active users. However, we have not shown that these bounds are tight. Indeed, numerical experiments show that the bounds are somewhat loose in practice. Nonetheless, the theory provides useful scaling relationships with the metrics $\mu(X)$ and $\|X\|_2$ which, as we have demonstrated, can guide non-orthogonal codeword designs in practical systems.

We conclude this section by pointing out three key directions of future work in the context of random access within asynchronous network settings. One of these directions involves modifying Algorithm 1 to allow for a small fraction of missed detections at the expense of reducing the fraction of false positives. The second direction involves investigating tight converses of Theorems 1 and 2 in terms of $k$, $N$, and $M$. The last direction involves extending Theorems 1 and 2 under the assumption of multipath in the uplink. Given the structured nature of the problem discussed here, all three of these directions present some unique analytical challenges, and we expect to address those challenges in a sequel to this work.

VI. Conclusion 

In this paper, we described a novel scheme for MUD in RACs that allows for the user codewords 
to be received asynchronously at the receiver. We leveraged and generalized sparse signal theory to 
provide recovery guarantees for a lasso-based algorithm to find the set of active users. While our results 
are general and applicable to arbitrary sets of codewords, we specialized them to two specific sets of 
codewords, random binary codewords and specially constructed algebraic codewords. 



June 22, 2011 



DRAFT 






The implications of the scaling behavior outlined in the pairs of inequalities in Theorems 3 and 4 are quite positive in the important special case of fixed-bandwidth spread spectrum waveforms and a BS serving a bounded geographic region. Specifically, they signify that, for any fixed number of temporal signal space dimensions $N$ and maximum delay $\tau$ in the system, the proposed MUD scheme can accommodate $M \lesssim \exp(O(N^{1/3}))$ total users in the case of random signaling and $M$ polynomial in $N$ when using our algebraic code design. Both sets of codewords allow $k \lesssim N/\log M$ active users in the system. This is a significant improvement over the $k \le M \le N$ scaling suggested by the use of classical matched filtering-based approaches to MUD employing orthogonal signaling.
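As a rough feel for these scalings, the snippet below evaluates $N/\log M$ at the parameter values used in the numerical experiments. It is only an order-of-magnitude illustration: the constants hidden in the theorems are ignored (set to one), which is an assumption, so the number should not be read as a predicted recovery threshold.

```python
import math

N = 1023    # codeword length used in the experiments
M = 3072    # total number of users used in the experiments

# Orthogonal signaling supports at most N users in total, so M = 3072 > N is out of reach.
orthogonal_supports_M = M <= N

# Scaling of the number of simultaneously active users, with all constants set to 1.
active_scaling = N / math.log(M)
print(orthogonal_supports_M, round(active_scaling, 1))
```

The point of the comparison is qualitative: the total population $M$ may exceed $N$, while the number of simultaneously active users is what scales like $N/\log M$.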
Appendix A

Proof of the Main Result: Arbitrary Delays

In this appendix, we provide a proof of Theorem 1. Before proceeding further, however, let us develop some notation to facilitate the forthcoming analysis. Throughout this appendix, we use $X_B$ to denote the block subdictionary of $X$ obtained by collecting the Toeplitz blocks of $X$ corresponding to the indices of the active users: $X_B \triangleq [X_i : i \in \mathcal{I}]$. In addition, we use $X_S$ to denote the $(N+\tau) \times |\mathcal{I}|$ submatrix obtained by collecting the columns of $X$ corresponding to the nonzero entries of $\beta$, while we use $\beta_S$ to denote the $|\mathcal{I}|$-dimensional vector comprising the nonzero entries of $\beta$. Finally, we use sgn(·) for the elementwise signum function: $\mathrm{sgn}(z) \triangleq z/|z|$ for any $z \in \mathbb{R}$.

The basic idea behind the proof of Theorem 1 follows from the proof of [6, Theorem 1.3]. Specifically, using $S \subset \{1, \dots, M(\tau+1)\}$ to denote the set of the locations of the nonzero entries of $\beta$, we have from [6, Lemma 3.4] that the lasso solution $\hat{\beta} \triangleq \beta + h$ satisfies $h_{S^c} = 0$ and

$$h_S = (X_S^T X_S)^{-1}\left[X_S^T w - \lambda\,\mathrm{sgn}(\beta_S)\right] \tag{30}$$

if $\min_{i \in S} |\beta_i| > 4\lambda$ and the following five conditions are met:
ie5 
• $C_1$ - Invertibility condition: $\|(X_S^T X_S)^{-1}\|_2 \le 2$.
• $C_2$ - Noise stability: $\|(X_S^T X_S)^{-1} X_S^T w\|_\infty \le \lambda$.
• $C_3$ - Complementary noise stability: $\|X_{S^c}^T (I - X_S (X_S^T X_S)^{-1} X_S^T) w\|_\infty \le \lambda/\sqrt{2}$.
• $C_4$ - Size condition: $\|(X_S^T X_S)^{-1}\mathrm{sgn}(\beta_S)\|_\infty \le 3$.
• $C_5$ - Complementary size condition: $\|X_{S^c}^T X_S (X_S^T X_S)^{-1}\mathrm{sgn}(\beta_S)\|_\infty \le \frac{1}{4}$.

Furthermore, it trivially follows in this case that the set of non-zero elements of $\hat{\beta}$ is $S$, which guarantees that $\hat{\mathcal{I}} = \mathcal{I}$. Our goal then is to consider the probability of each one of these conditions not being met under the assumptions of Theorem 1; the proof of the theorem then simply follows from the union bound.

A. Invertibility Condition 

In order to establish the invertibility condition, we will make use of the following proposition from [18].

Proposition 1 ([18]): Fix $q = 2\log(M(\tau+1))$ and define the block coherence

$$\mu_B(X) \triangleq \max_{1 \le i,j \le M} \|X_i^T X_j - 1_{\{i=j\}} I\|_2. \tag{31}$$

Then, for $E_q Z \triangleq [E|Z|^q]^{1/q}$ and $\delta \triangleq k/M$, we have the following bound

$$E_q\|X_B^T X_B - I\|_2 \le 20\mu_B(X)\log(M(\tau+1)) + \delta\|X\|_2^2 + 9\sqrt{\delta\log(M(\tau+1))\,(1+\tau\mu(X))}\,\|X\|_2. \tag{32}$$

We would like to bound (32) via bounds on $\mu_B(X)$, $\mu(X)$ and $\|X\|_2$. First, we can use the linear algebra fact $\|\cdot\|_2 \le \sqrt{\|\cdot\|_1\|\cdot\|_\infty}$ [19] on (31) to show that $\mu_B(X) \le (\tau+1)\mu(X)$. Thus, we can rearrange the inequalities of (11) and (12) to obtain



^^""^ - c(r+l)logV(- + l))' ^^^^ 



IIXII2 < and (34) 

" - c(51og(M(T + l))' 

M^) < , fj, ^,,y (35) 
clog [M{t + 1)) 



1 



Substituting these inequalities into (32) and choosing $c$ appropriately large yields $E_q\|X_B^T X_B - I\|_2 \le \frac{1}{4}$. Finally, notice that $X_S$ is a submatrix of $X_B$ and therefore we trivially have $\|X_S^T X_S - I\|_2 \le \|X_B^T X_B - I\|_2$. It can then be easily seen from the Markov inequality that

$$\Pr\left(\|X_S^T X_S - I\|_2 > 1/2\right) \le 2^q\left(E_q\|X_B^T X_B - I\|_2\right)^q \overset{(a)}{\le} (M(\tau+1))^{-2\log 2}, \tag{36}$$

where (a) follows from the fact that $E_q\|X_B^T X_B - I\|_2 \le \frac{1}{4}$. We have now established that $\|X_S^T X_S\|_2 \in (1/2, 3/2)$ with high probability, which implies that

$$\Pr(C_1^c) \le (M(\tau+1))^{-2\log 2}. \tag{37}$$
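For completeness, the arithmetic behind the last step of (36) can be written out as follows. This is a worked sketch that assumes the moment bound is $\frac{1}{4}$ and that $q = 2\log(M(\tau+1))$ as fixed in Proposition 1:

```latex
\begin{align*}
2^q \left(E_q\|X_B^T X_B - I\|_2\right)^q
  &\le 2^q \left(\tfrac{1}{4}\right)^q = 2^{-q} \\
  &= \exp\!\left(-q \log 2\right)
   = \exp\!\left(-2\log 2 \cdot \log(M(\tau+1))\right) \\
  &= (M(\tau+1))^{-2\log 2}.
\end{align*}
```

The link to the invertibility condition is then that $\|X_S^T X_S - I\|_2 \le 1/2$ forces every eigenvalue of $X_S^T X_S$ to be at least $1/2$, so that $\|(X_S^T X_S)^{-1}\|_2 \le 2$.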

B. Noise Stability 

In order to establish the noise-stability condition, we first condition on $C_1$ (the invertibility condition). Next, we denote the $j$-th column of $X_S(X_S^T X_S)^{-1}$ by $z_j$ and note that

$$\|(X_S^T X_S)^{-1} X_S^T w\|_\infty = \max_{1 \le j \le |S|} |\langle z_j, w\rangle|. \tag{38}$$

Furthermore, since the noise vector $w$ is distributed as $\mathcal{N}(0, I)$, we also have that $\langle z_j, w\rangle \sim \mathcal{N}(0, \|z_j\|_2^2)$. Finally, note that conditioned on $C_1$, we have the upper bound

$$\|z_j\|_2 \le \|X_S(X_S^T X_S)^{-1}\|_2 \le \sqrt{2},$$

where the second inequality can be seen by considering the singular value decomposition of $X_S$ along with the bound on the singular values from $C_1$.

The rest of the argument now follows easily from bounds on the maximum of a collection of arbitrary Gaussian random variables. Specifically, it can be seen from the previous discussion and a real-valued version of [11, Lemma 6] that

$$\Pr\left(\|(X_S^T X_S)^{-1} X_S^T w\|_\infty \ge \sqrt{2}\,t \,\middle|\, C_1\right) \le \frac{2Me^{-t^2/2}}{\sqrt{2\pi}\,t}.$$

We substitute $t = \lambda/\sqrt{2}$ in the above expression to obtain

$$\Pr(C_2^c \mid C_1) \le \frac{2Me^{-\lambda^2/4}}{\sqrt{\pi}\,\lambda} \le \frac{1}{M(\tau+1)\sqrt{2\pi\log(M\sqrt{\tau+1})}}. \tag{39}$$

Summarizing, the noise stability condition fails, conditioned on $C_1$, with probability at most $\frac{1}{M(\tau+1)\sqrt{2\pi\log(M\sqrt{\tau+1})}}$.

C. Complementary Noise Stability 

In order to establish the complementary noise-stability condition, we use ideas similar to the ones used in the previous section. To begin with, we again condition on the event $C_1$ and use $P_{X_S} \triangleq X_S(X_S^T X_S)^{-1}X_S^T$ to denote the orthogonal projector onto the column span of $X_S$. Next, we use $z_j$ to denote the $j$-th column of $(I - P_{X_S})X_{S^c}$ and note that

$$\|X_{S^c}^T(I - P_{X_S})w\|_\infty = \max_{1 \le j \le |S^c|} |\langle z_j, w\rangle|. \tag{40}$$

Finally, given that $P_{X_S}$ is a projection matrix and the columns of $X$ have unit norm, we have that

$$\|z_j\|_2 = \|(I - P_{X_S})X_{S^c}e_j\|_2 \le 1, \tag{41}$$

where $e_j$ denotes the $j$-th canonical basis vector.

It is now easy to see that, since $\langle z_j, w\rangle$ is also distributed as $\mathcal{N}(0, \|z_j\|_2^2)$, we can make use of [11, Lemma 6] to obtain

$$\Pr\left(\|X_{S^c}^T(I - P_{X_S})w\|_\infty \ge t \,\middle|\, C_1\right) \le \frac{2M(\tau+1)e^{-t^2/2}}{\sqrt{2\pi}\,t}.$$

We substitute $t = \lambda/\sqrt{2}$ in the above expression to obtain

$$\Pr(C_3^c \mid C_1) \le \frac{2M(\tau+1)e^{-\lambda^2/4}}{\sqrt{\pi}\,\lambda} \le \frac{1}{M\sqrt{2\pi\log(M\sqrt{\tau+1})}}.$$

Summarizing, the complementary noise stability condition fails, conditioned on $C_1$, with probability at most

$$\frac{1}{M\sqrt{2\pi\log(M\sqrt{\tau+1})}}. \tag{42}$$



D. Size Condition 



In order to establish the size condition, we first write

$$\|(X_S^T X_S)^{-1}\mathrm{sgn}(\beta_S)\|_\infty \overset{(a)}{\le} \|((X_S^T X_S)^{-1} - I)\,\mathrm{sgn}(\beta_S)\|_\infty + \|\mathrm{sgn}(\beta_S)\|_\infty \tag{43}$$

$$= \|((X_S^T X_S)^{-1} - I)\,\mathrm{sgn}(\beta_S)\|_\infty + 1, \tag{44}$$

where (a) follows from the triangle inequality and we once again use $z_j$ to denote the $j$-th column of $((X_S^T X_S)^{-1} - I)$. Now define $A \triangleq (X_S^T X_S - I)$ and condition on the event $C_1$. Then it follows from the Neumann series (cf. [6, p. 2171]) that $\|z_j\|_2 \le 2\|Ae_j\|_2$. Furthermore, since $X_S$ is a submatrix of $X_B$, we have $\|Ae_j\|_2 \le \|(X_B^T X_B - I)e_{j'}\|_2$, where $j'$ is such that the $j'$-th column of $X_B$ matches the $j$-th column of $X_S$.

Finally, define the diagonal matrix $Q \triangleq \mathrm{diag}(\delta_1, \dots, \delta_M)$ with the "random activation variables" $\{\delta_i\}$ on the diagonal and define a new matrix $R \triangleq Q \otimes I_{\tau+1}$, where $\otimes$ denotes the Kronecker product. Next, use the notation $H \triangleq (X^T X - I)$ and notice that $\|(X_B^T X_B - I)e_{j'}\|_2 = \|RHe_{j''}\|_2$, where $j''$ is such that the $j''$-th column of $X$ matches the $j$-th column of $X_S$. In addition, note that $H$ has a block structure that can be expressed as

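$$H = \begin{bmatrix} H_1 & H_2 & \cdots & H_M \end{bmatrix} = \begin{bmatrix} H_{1,1} & H_{1,2} & \cdots & H_{1,M} \\ H_{2,1} & H_{2,2} & \cdots & H_{2,M} \\ \vdots & \vdots & \ddots & \vdots \\ H_{M,1} & H_{M,2} & \cdots & H_{M,M} \end{bmatrix}, \tag{45}$$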
where $H_{i,j} = X_i^T X_j - 1_{\{i=j\}} I$, $1 \le i,j \le M$, and $H_j = [H_{1,j}^T \; \cdots \; H_{M,j}^T]^T$. We now define two blockwise norms on $H$ as follows: $\|H\|_{B,1} \triangleq \max_{1 \le j \le M} \|H_j\|_2$, and $\|H\|_{B,2} \triangleq \max_{1 \le i,j \le M} \|H_{i,j}\|_2$. Then it follows from the preceding discussion and the structure of the block matrix $H$ that

$$\|z_j\|_2 \le 2\|Ae_j\|_2 \le 2\|RHe_{j''}\|_2 \le 2\|RH\|_{B,1}. \tag{46}$$



Our next goal then is to provide a bound on $\|RH\|_{B,1}$ and for this we resort to [18, Lemma 5].

Proposition 2 ([18]): For $q \ge 2\log M$ and $\delta = k/M$, we have that

$$E_q\|RH\|_{B,1} \le 2^{1.5}\sqrt{q}\,\|H\|_{B,2} + \sqrt{\delta}\,\|H\|_{B,1}. \tag{47}$$

Now notice from the definition of $H$ and $\|\cdot\|_{B,2}$ that $\|H\|_{B,2} = \mu_B(X) \le (\tau+1)\mu(X)$. In addition, we have from the definition of $H$ and $\|\cdot\|_{B,1}$ that

$$\|H\|_{B,1} \overset{(b)}{\le} \max_{1 \le i \le M}\|X^T X_i\|_2 + \|I_{\tau+1}\|_2 \overset{(c)}{\le} \sqrt{1+\tau\mu(X)}\,\|X\|_2 + 1 \le 2\sqrt{1+\tau\mu(X)}\,\|X\|_2, \tag{48}$$

where (b) follows from the definition of the spectral norm and the triangle inequality, while (c) mainly follows from the fact that $\|X_i\|_2 \le \sqrt{1+\tau\mu(X)}$ because of the Geršgorin disc theorem [19]. We can now fix $q = 2\log M$ and make use of the above bounds to conclude from Proposition 2 that

$$E_q\|RH\|_{B,1} \le 4(\tau+1)\mu(X)\sqrt{\log M} + 2\sqrt{\delta(1+\tau\mu(X))}\,\|X\|_2. \tag{49}$$



We can now substitute ( [33) and p4| ) into the above expression to obtain Eg||RH||B i < 70 with 



def 4: 2/1 

~ cVlog(Af(r + ^ ^clog(M(r + iyyV^^ clog(M(r + l))- ^^^^ 
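In order to establish the size condition, we now define the event $\mathcal{E} \stackrel{\text{def}}{=} \{\max_{1\le j\le|\mathcal{S}|}\|z_j\|_2 < \gamma\}$ and
make use of the Markov inequality along with (46) and the preceding discussion to obtain

$$\Pr(\mathcal{E}^c) \le \gamma^{-q}\Big[\mathbb{E}_q\max_{1\le j\le|\mathcal{S}|}\|z_j\|_2\Big]^q \le \Big[\frac{2}{\gamma}\,\mathbb{E}_q\|RH\|_{B,1}\Big]^q \le \Big(\frac{2\gamma_0}{\gamma}\Big)^q.$$

Finally, we use $Z \stackrel{\text{def}}{=} \max_{1\le j\le|\mathcal{S}|}|\langle z_j,\mathrm{sgn}(\beta_{\mathcal{S}})\rangle|$ and conclude that

$$\Pr(Z \ge t) \le \Pr(Z \ge t \mid \mathcal{E}) + \Pr(\mathcal{E}^c) \stackrel{(d)}{\le} 2Me^{-t^2/2\gamma^2} + (2\gamma_0/\gamma)^q, \qquad (51)$$

where (d) is a consequence of the Hoeffding inequality and the union bound. The condition is now
established from (43) by setting $t = 2$ in the above expression. Furthermore, set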

$$\gamma = \sqrt{\frac{2}{(1+2\log 2)\log M}}, \qquad (52)$$

which leads to $2Me^{-2/\gamma^2} \le 2M^{-2\log 2}$ and

3, ^ 2(^ + 2) 

7 - 0.9155c ^ ^ ' 

Therefore, we obtain that $\Pr(\mathcal{E}^c) \le (1/2)^q \le M^{-2\log 2}$ and thus we have that the size condition does
not hold with probability at most

$$\Pr(C_4^c \mid C_1) \le 3M^{-2\log 2}. \qquad (54)$$
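The constants in the size-condition argument can be sanity-checked numerically. The sketch below assumes $t = 2$, $q = 2\log M$ (natural logarithm) and the choice $\gamma = \sqrt{2/((1+2\log 2)\log M)}$, which is the value consistent with the constant 0.9155 appearing in the ratio $\gamma_0/\gamma$; the function name and the sample value of `M` are illustrative only.

```python
import math

def size_condition_tail(M):
    """Evaluate the two tail terms of the size condition for a given M."""
    q = 2 * math.log(M)
    gamma = math.sqrt(2.0 / ((1 + 2 * math.log(2)) * math.log(M)))
    hoeffding = 2 * M * math.exp(-2.0 / gamma ** 2)   # 2 M exp(-t^2 / (2 gamma^2)) with t = 2
    target = 2 * M ** (-2 * math.log(2))              # claimed value 2 M^{-2 log 2}
    markov = 0.5 ** q                                 # (1/2)^q, valid once 2*gamma_0/gamma <= 1/2
    scaled_gamma = gamma * math.sqrt(math.log(M))     # gamma = scaled_gamma / sqrt(log M)
    return scaled_gamma, hoeffding, target, markov

check = size_condition_tail(100)
```

Under these assumptions the Hoeffding term coincides with $2M^{-2\log 2}$ and the Markov term with $M^{-2\log 2}$, so the two together give the factor 3 in (54).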

E. Complementary Size Condition 

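In order to establish the complementary size condition, we proceed similarly to the case of the "size condition" and define $z_j$ as $z_j \stackrel{\text{def}}{=} (X_{\mathcal{S}}^TX_{\mathcal{S}})^{-1}X_{\mathcal{S}}^TX_{\mathcal{S}^c}e_j$. It can then be easily seen that $\|X_{\mathcal{S}^c}^TX_{\mathcal{S}}(X_{\mathcal{S}}^TX_{\mathcal{S}})^{-1}\mathrm{sgn}(\beta_{\mathcal{S}})\|_\infty = \max_{1\le j\le|\mathcal{S}^c|}|\langle z_j,\mathrm{sgn}(\beta_{\mathcal{S}})\rangle|$. Now condition on the event $C_1$ and notice that $\|z_j\|_2 \le 2\|X_{\mathcal{S}}^TX_{\mathcal{S}^c}e_j\|_2$,
$j = 1,\dots,|\mathcal{S}^c|$.

We now define $X_{\mathcal{B}^c} \stackrel{\text{def}}{=} [X_i : i \in \mathcal{B}^c]$ and consider the set of indices $\mathcal{T}_1 \stackrel{\text{def}}{=} \{j : X_{\mathcal{S}^c}e_j \text{ is a column in } X_{\mathcal{B}^c}\}$.
It is then easy to argue by making use of the notation developed in Section A-D that if $j \in \mathcal{T}_1$ then

$$\|X_{\mathcal{S}}^TX_{\mathcal{S}^c}e_j\|_2 \le \max_{i\in\mathcal{B}^c}\|X_{\mathcal{B}}^TX_i\|_2 = \|X_{\mathcal{B}}^TX_{\mathcal{B}^c}\|_{B,1} \stackrel{(a)}{\le} \|RH\|_{B,1}, \qquad (55)$$

where (a) follows from the fact that $X_{\mathcal{B}}^TX_{\mathcal{B}^c}$ is a submatrix of $RH$. We therefore have from the discussion
following Proposition 2 and the Markov inequality that $\forall\, j \in \mathcal{T}_1$ and for $q = 2\log M$ and $\gamma >$

$$\Pr(\|X_{\mathcal{S}}^TX_{\mathcal{S}^c}e_j\|_2 \ge \gamma) \le \Pr(\|RH\|_{B,1} \ge \gamma) \le \Big(\frac{\gamma_0}{\gamma}\Big)^q. \qquad (56)$$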

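Finally, the argument involving $j \in \mathcal{T}_1^c$ is a little more involved but follows along similar lines.
Specifically, fix any $j \in \mathcal{T}_1^c$ and define $i' \in \mathcal{B}$ to be such that $X_{\mathcal{S}^c}e_j$ is a column of $X_{i'}$. Next, define $x_{\mathcal{S}\cap i'}$
to be the column of $X_{\mathcal{S}}$ that lies within the Toeplitz block $X_{i'}$ and $X_{\mathcal{S}\setminus i'}$ to be the submatrix constructed
by removing the column $x_{\mathcal{S}\cap i'}$ from $X_{\mathcal{S}}$. Then, if we use the notation $X_{\mathcal{B}\setminus i'} \stackrel{\text{def}}{=} [X_i : i \in \mathcal{B}\setminus\{i'\}]$, it
can be verified that for any $j \in \mathcal{T}_1^c$ we have

$$\|X_{\mathcal{S}}^TX_{\mathcal{S}^c}e_j\|_2^2 = \|X_{\mathcal{S}\setminus i'}^TX_{\mathcal{S}^c}e_j\|_2^2 + |x_{\mathcal{S}\cap i'}^TX_{\mathcal{S}^c}e_j|^2 \stackrel{(b)}{\le} \max_{i'\in\mathcal{B}}\|X_{\mathcal{B}\setminus i'}^TX_{i'}\|_2^2 + \mu^2(X) \le \|RH\|_{B,1}^2 + \mu^2(X), \qquad (57)$$

where (b) again makes use of the fact that the spectral norm of a matrix is an upper bound for the
spectral norm of any of its submatrices. We therefore once again obtain from the discussion following
Proposition 2 and the Markov inequality that $\forall\, j \notin \mathcal{T}_1$ and for $q = 2\log M$ and $\gamma >$

$$\Pr(\|X_{\mathcal{S}}^TX_{\mathcal{S}^c}e_j\|_2 \ge \gamma) \le \Pr\Big(\|RH\|_{B,1} \ge \sqrt{\gamma^2-\mu^2(X)}\Big) \le \Big(\frac{\gamma_0}{\sqrt{\gamma^2-\mu^2(X)}}\Big)^q. \qquad (58)$$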

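We can now define the event $\mathcal{E} \stackrel{\text{def}}{=} \{\|X_{\mathcal{S}}^TX_{\mathcal{S}^c}e_j\|_2 < \gamma\}$ and use the notation $Z \stackrel{\text{def}}{=} \max_{1\le j\le|\mathcal{S}^c|}|\langle z_j,\mathrm{sgn}(\beta_{\mathcal{S}})\rangle|$
to conclude from (56) and (58) that

$$\Pr(Z \ge t) \le \Pr(Z \ge t \mid \mathcal{E}) + \Pr(\mathcal{E}^c) \stackrel{(c)}{\le} 2M(\tau+1)e^{-t^2/2\gamma^2} + (\gamma_0/\gamma)^q + \big(\gamma_0/\sqrt{\gamma^2-\mu^2(X)}\big)^q, \qquad (59)$$

where (c) follows from the Hoeffding inequality and the union bound. The condition is now established
by setting $t = 1/4$ in the above expression. Furthermore, set

$$\gamma = \frac{1}{\sqrt{32(1+2\log 2)\log(M(\tau+1))}}, \qquad (60)$$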

which yields 2M(r + l)e-i/327^ < 2(M(t + l))-2iog2 ^^^^ 70 < ^^+! < ^ /2_ Therefore, 
we obtain that $\Pr(\mathcal{E}^c) \le 2\big(\gamma_0/\sqrt{\gamma^2-\mu^2(X)}\big)^q \le 2(1/2)^q \le 2(M(\tau+1))^{-2\log 2}$ and thus we have that the
complementary size condition satisfies

$$\Pr(C_5^c \mid C_1) \le 4(M(\tau+1))^{-2\log 2}. \qquad (61)$$



F. Proof of Theorem 1

The proof of Theorem 1 follows from the preceding discussion by taking a union bound over all
the respective conditions and removing the conditionings: $\Pr((C_1 \cap C_2 \cap C_3 \cap C_4 \cap C_5)^c) \le \Pr(C_1^c) +
\Pr(C_2^c \mid C_1) + \Pr(C_3^c \mid C_1) + \Pr(C_4^c \mid C_1) + \Pr(C_5^c \mid C_1)$. Consequently, we obtain that the probability of error
is upper bounded by $2M^{-1}(2\pi\log(M\sqrt{\tau+1}))^{-1/2} + 5(M(\tau+1))^{-2\log 2} + 3M^{-2\log 2}$.

Appendix B 
Proof of the Main Result: Random Delays 

In this appendix, we provide a proof of Theorem 2. The proof parallels that of Theorem 1; thus the
definitions and notation in Appendix A are reused. Key to the proof is the distribution and generation of
the support set $\mathcal{S}$, which we examine first.

As described in Section II, here we consider the case when the delays $\tau_i$ are selected uniformly at random
from $\{0,\dots,\tau\}$. Translating the notions of users and delays to the block structure of $X$, the set $\mathcal{S}$ can be
viewed as generated by a two-step procedure: (1) blocks are activated with probability $k/M$; (2)
within each active block, a delay/column is selected uniformly at random. We call this the conventional
activation procedure (CAP). However, to prove Theorem 2 it is useful to examine a different activation
procedure for $\mathcal{S}$ as follows:

1) Let $\{\delta_i\}_{i=1}^{M(\tau+1)}$ be a set of Bernoulli random variables with $\Pr(\delta_i = 1) = \frac{kp}{(\tau+1)M} \stackrel{\text{def}}{=} \delta$. Set $\tilde{\mathcal{S}}$ to
be the set $\{i : \delta_i = 1\}$.

2) Mapping indices to the block structure on $X$, prune $\tilde{\mathcal{S}}$ to $\mathcal{S}$: For each block with more than one
active element in $\tilde{\mathcal{S}}$, select a single element uniformly at random among the active elements in the
block.

We call this the equivalent activation procedure (EAP) and we now argue that, with an appropriate value
of $p$, the set $\mathcal{S}$ is distributed identically to that generated using the conventional procedure. Of particular
utility will be the set $\tilde{\mathcal{S}}$ since $\tilde{\mathcal{S}} \supseteq \mathcal{S}$ and $\tilde{\mathcal{S}}$ is generated simply from iid Bernoulli variables. We further
define $\tilde{\mathcal{S}}_i$ to be $\tilde{\mathcal{S}}$ restricted to elements in block $i$.

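The value of $p$ needed can be calculated by requiring the probability of block activity to be equal
under the CAP and the EAP. That is, for any $i = 1,\dots,M$,

$$\Pr[|\tilde{\mathcal{S}}_i| \ge 1] = 1 - \Pr[|\tilde{\mathcal{S}}_i| = 0] = 1 - \Big(1 - \frac{kp}{(\tau+1)M}\Big)^{\tau+1} = \frac{k}{M}, \qquad (62)$$

where the last equality links the two procedures. Solving for $p$ gives

$$p = \frac{M(\tau+1)}{k}\Big[1 - \Big(1 - \frac{k}{M}\Big)^{1/(\tau+1)}\Big]. \qquad (63)$$

When $k \ll M$, $(1 - k/M)^{1/(\tau+1)} \approx 1 - \frac{k}{M(\tau+1)}$ and $p \approx 1$ as expected. This approximation will be made
more explicit later in (67).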
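As a quick numerical illustration, the value of $p$ obtained by solving (62) can be evaluated and substituted back. This is a minimal sketch; the function names and the sample values of `k`, `M` and `tau` are assumptions chosen only for the example.

```python
def eap_p(k, M, tau):
    """Solve 1 - (1 - k*p/((tau+1)*M))**(tau+1) = k/M for p."""
    return M * (tau + 1) / k * (1.0 - (1.0 - k / M) ** (1.0 / (tau + 1)))

def block_activity(k, M, tau, p):
    """Probability that a block has at least one active column under the EAP."""
    return 1.0 - (1.0 - k * p / ((tau + 1) * M)) ** (tau + 1)

# Sparse regime (k << M): p should be close to 1.
p_sparse = eap_p(10, 1000, 7)
# Boundary of the assumption k/M <= 1/4 used later in the proof.
p_dense = eap_p(250, 1000, 7)

activity_sparse = block_activity(10, 1000, 7, p_sparse)
activity_dense = block_activity(250, 1000, 7, p_dense)
```

Substituting either value back into the block-activity expression recovers $k/M$, and both values of $p$ lie between 1 and 4/3.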

To prove equivalence in distribution between the two methods, it remains to show independence of 
blocks and uniformity among columns in blocks in the EAP. Independence of blocks is inherited from 
the independence of column activation in Step 1 (since the blocks are disjoint sets). We now make a 
symmetry argument to show a uniform selection of columns. 

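Let $(i,j)$ be an arbitrary column/block pair. Let $\mathcal{Y}$ be the event that $(i,j)$ is activated in Step 1 and
let $\mathcal{X}$ be the event that $(i,j)$ is selected in Step 2. Since the events satisfy $\mathcal{X} \subseteq \mathcal{Y} \subseteq \{|\tilde{\mathcal{S}}_j| > 0\}$, we can
write

$$
\begin{aligned}
\Pr[\mathcal{X}] &= \Pr[\mathcal{X} \cap \mathcal{Y} \cap \{|\tilde{\mathcal{S}}_j| > 0\}] \\
&= \sum_{n=1}^{\tau+1}\Pr[\mathcal{X} \cap \mathcal{Y} \cap \{|\tilde{\mathcal{S}}_j| = n\}] \\
&= \sum_{n=1}^{\tau+1}\Pr[\mathcal{X} \mid \mathcal{Y} \cap \{|\tilde{\mathcal{S}}_j| = n\}]\,\Pr[\mathcal{Y} \cap \{|\tilde{\mathcal{S}}_j| = n\}] \\
&= \sum_{n=1}^{\tau+1}\frac{1}{n}\Pr[\mathcal{Y} \cap \{|\tilde{\mathcal{S}}_j| = n\}], \qquad (64)
\end{aligned}
$$
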
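where the last equality is due to the uniform selection in Step 2. Now, for $n = 1,\dots,\tau+1$, we have

$$
\begin{aligned}
\Pr[\mathcal{Y} \cap \{|\tilde{\mathcal{S}}_j| = n\}] &= \Pr[\{|\tilde{\mathcal{S}}_j| = n\} \mid \mathcal{Y}]\,\Pr[\mathcal{Y}] \\
&= \binom{\tau}{n-1}\Big(1 - \frac{kp}{(\tau+1)M}\Big)^{\tau-n+1}\Big(\frac{kp}{(\tau+1)M}\Big)^{n-1}\Big[\frac{kp}{(\tau+1)M}\Big] \\
&= \binom{\tau}{n-1}\Big(1 - \frac{kp}{(\tau+1)M}\Big)^{\tau-n+1}\Big(\frac{kp}{(\tau+1)M}\Big)^{n}, \qquad (65)
\end{aligned}
$$

where the first factor is Binomial over the $\tau$ remaining columns given that $(i,j)$ was activated in Step 1. At
this point, it is sufficient to note that (65) is not a function of our choice of $(i,j)$. Thus, by symmetry,
the probability is equal for all columns. Nonetheless, we complete the calculation to show it takes the
anticipated value. Returning to (64), we have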

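$$
\begin{aligned}
\Pr[\mathcal{X}] &= \sum_{n=1}^{\tau+1}\frac{1}{n}\binom{\tau}{n-1}\Big(\frac{kp}{(\tau+1)M}\Big)^{n}\Big(1 - \frac{kp}{(\tau+1)M}\Big)^{\tau-n+1} \\
&= \frac{1}{\tau+1}\sum_{n=1}^{\tau+1}\binom{\tau+1}{n}\Big(\frac{kp}{(\tau+1)M}\Big)^{n}\Big(1 - \frac{kp}{(\tau+1)M}\Big)^{\tau+1-n} \\
&= \frac{1}{\tau+1}\Big[1 - \Big(1 - \frac{kp}{(\tau+1)M}\Big)^{\tau+1}\Big] \\
&= \frac{1}{\tau+1}\,\frac{k}{M},
\end{aligned}
$$

where in the second equality we use a simple identity on Binomial coefficients, in the third equality we
note the sum is nearly complete over a Binomial distribution function and lastly, we use (62).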
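The calculation above can be cross-checked both exactly and by simulation. The sketch below is illustrative only: it evaluates the binomial sum for the selection probability of one column and simulates the two EAP steps for a single block (blocks are independent, so one block suffices). The parameter values, the seed and the function names are assumptions, not part of the original analysis.

```python
import math
import random

def eap_delta(k, M, tau):
    """Per-column activation probability k*p/((tau+1)*M), with p solving (62)."""
    return 1.0 - (1.0 - k / M) ** (1.0 / (tau + 1))

def selection_probability(delta, tau):
    """Sum over n of (1/n) * C(tau, n-1) * delta^n * (1-delta)^(tau-n+1)."""
    return sum(math.comb(tau, n - 1) / n * delta ** n * (1 - delta) ** (tau - n + 1)
               for n in range(1, tau + 2))

def simulate_block(delta, tau, trials, rng):
    """Empirical frequency with which each column of one block survives Step 2."""
    counts = [0] * (tau + 1)
    for _ in range(trials):
        # Step 1: activate each column independently with probability delta.
        active = [c for c in range(tau + 1) if rng.random() < delta]
        # Step 2: keep a single active column chosen uniformly at random.
        if active:
            counts[rng.choice(active)] += 1
    return [c / trials for c in counts]

k, M, tau = 20, 100, 3
delta = eap_delta(k, M, tau)
exact = selection_probability(delta, tau)
freqs = simulate_block(delta, tau, 200_000, random.Random(1))
anticipated = k / ((tau + 1) * M)
```

Here `exact` reproduces the anticipated value $k/((\tau+1)M)$ up to rounding, and the entries of `freqs` fluctuate around the same value, which is what the CAP assigns to each column.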

Having shown the equivalence between CAP and EAP, we are ready to prove the five conditions $C_1$
through $C_5$ that guarantee recovery. While our model on the users corresponds to the CAP, we will use
the EAP in the remainder of the proof. Since $\tilde{\mathcal{S}}$ is formed from iid random variables, we are able to
follow a proof technique similar to that of [6]. We include our proof for completeness since our theory
is based on slightly different assumptions and aspects of our proof use different methods.

A. Invertibility Condition 

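To bound $\|(X_{\mathcal{S}}^TX_{\mathcal{S}})^{-1}\|_2$ we consider $\mathcal{S}$ and $\tilde{\mathcal{S}}$ as generated from the EAP. Since $\tilde{\mathcal{S}}$ is uniformly distributed
over possible column selections, we can bound $\|(X_{\tilde{\mathcal{S}}}^TX_{\tilde{\mathcal{S}}})^{-1}\|_2$ using methods of [6] where, using [20]
and $q = 2\log(M(\tau+1))$, we have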



E,||XTX, - lib < 30MX) log(M(r + 1)) + ,^^^M30M^L±M. 

We would like to translate this into conditions similar to (33) and (34). To do so, we make the approximation
noted below (63) explicit. We will assume $k/M \le 1/4$ here, which follows trivially from the
condition (14) in the theorem. This assumption allows us to make the following approximation:

$$1 - (1 - k/M)^{1/(\tau+1)} \le \frac{k}{M(\tau+1)}\,(1 - 1/4)^{-\tau/(\tau+1)} \le \frac{k}{M(\tau+1)}\cdot\frac{4}{3}. \qquad (67)$$

The first inequality is an application of Taylor's remainder theorem on the function $f(x) = 1 - (1 - x)^{1/(\tau+1)}$,
while the second inequality is due to the fact that $(1-\epsilon)^{1/(\tau+1)} \le 1$ for $\epsilon > 0$. Applying this
approximation to (63) yields $p \le 4/3$.
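For completeness, the Taylor step can be spelled out as follows. This is a sketch of one standard way to obtain the bound (mean-value form of the remainder), written under the assumption $x = k/M \le 1/4$ stated above; it is not claimed to be the exact route taken in the original derivation.

```latex
f(x) = 1 - (1-x)^{1/(\tau+1)}, \qquad f(0) = 0, \qquad
f'(x) = \frac{1}{\tau+1}\,(1-x)^{-\tau/(\tau+1)} .

% f' is increasing on [0, 1), so for some \xi \in [0, x] with x \le 1/4:
f(x) = f'(\xi)\,x \;\le\; \frac{x}{\tau+1}\Big(1-\tfrac14\Big)^{-\tau/(\tau+1)}
     = \frac{x}{\tau+1}\cdot\frac43\cdot\Big(\tfrac34\Big)^{1/(\tau+1)}
     \;\le\; \frac{4x}{3(\tau+1)} .

% Substituting x = k/M into p = \frac{M(\tau+1)}{k}\, f(k/M):
p \;\le\; \frac{M(\tau+1)}{k}\cdot\frac{4k}{3M(\tau+1)} = \frac43 .
```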

Thus, if 

m(X) < , ^ .^l. — and (68) 

'^^ ' ~ c'log(M(r + 1)) 



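$$\|X\|_2^2 \le \frac{M(\tau+1)}{c'k\log(M(\tau+1))} \le \frac{M(\tau+1)}{c''kp\log(M(\tau+1))}, \qquad (69)$$

then $\mathbb{E}_q\|X_{\tilde{\mathcal{S}}}^TX_{\tilde{\mathcal{S}}} - I\|_2 \le 1/4$. Above, $c'$ and $c''$ are appropriately chosen constants independent of the
problem parameters. Since $\mathcal{S} \subseteq \tilde{\mathcal{S}}$, we have $\|X_{\mathcal{S}}^TX_{\mathcal{S}} - I\|_2 \le \|X_{\tilde{\mathcal{S}}}^TX_{\tilde{\mathcal{S}}} - I\|_2$ and therefore $\mathbb{E}_q\|X_{\mathcal{S}}^TX_{\mathcal{S}} - I\|_2 \le 1/4$. Following the calculations in Appendix A-A, cf. (36)-(37), this gives

$$\Pr(C_1^c) \le (M(\tau+1))^{-2\log 2}. \qquad (70)$$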



B. Noise Stability and Complementary Noise Stability 



Conditions $C_2$ and $C_3$ follow when conditioned on $C_1$ in an identical manner to Appendix A-B and
Appendix A-C with probabilities (39) and (42), respectively.

C. Size Condition

For $C_4$ we begin as in Appendix A-D with the following upper bound



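$$\|(X_{\mathcal{S}}^TX_{\mathcal{S}})^{-1}\mathrm{sgn}(\beta_{\mathcal{S}})\|_\infty \le \max_{1\le j\le|\mathcal{S}|}|\langle z_j,\mathrm{sgn}(\beta_{\mathcal{S}})\rangle| + 1,$$

where $z_j$ denotes the $j$-th column of $((X_{\mathcal{S}}^TX_{\mathcal{S}})^{-1} - I)$. Using the definitions from Section A-D and
additionally defining $\tilde{A} \stackrel{\text{def}}{=} X_{\tilde{\mathcal{S}}}^TX_{\tilde{\mathcal{S}}} - I$, and conditioning on $C_1$, we have

$$\|z_j\|_2 \le 2\|Ae_j\|_2 \le 2\|\tilde{A}e_{j'}\|_2, \qquad (71)$$

where the first inequality is due to an application of the Neumann series and $C_1$. The second inequality
is due to $A$ being a submatrix of $\tilde{A}$ and the choice of $j'$ such that the column $X_{\mathcal{S}}e_j$ is the same as
$X_{\tilde{\mathcal{S}}}e_{j'}$.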



Next we define $\tilde{R} \stackrel{\text{def}}{=} \mathrm{diag}(\delta_1,\dots,\delta_{M(\tau+1)})$ as a selection matrix for the EAP similar to $R$ so that
$X_{\tilde{\mathcal{S}}} = \tilde{R}X$ (conforming to the first step in the EAP). With this definition we have

$$\|\tilde{A}e_{j'}\|_2 \le \|\tilde{R}H\|_{1\to2}, \qquad (72)$$

where $\|\cdot\|_{1\to2}$ denotes the maximal column norm as defined in [20].

Since $\{\delta_i\}$ are iid Bernoulli random variables, we can apply [20, Theorem 3.2] which gives, for
$q = 2\log(M(\tau+1))$,



Eg||RH||i^2 < 2^-' Vlog(M(r + l))/i(X) + V5||H||i^2 



We can bound $\|H\|_{1\to2}$ as follows:

$$\|H\|_{1\to2} = \max_{1\le i\le M(\tau+1)}\|(X^TX - I)e_i\|_2 \le \max_{1\le i\le M(\tau+1)}\big(\|X^TXe_i\|_2 + \|e_i\|_2\big) \le \|X\|_2 + 1 \le 2\|X\|_2,$$

where we use $1 \le \|X\|_2$ since the columns have unit norm. This gives

E,||RH||i^2 < 2^-'^''^log{M{T + l))fi{X) + 2yi||X||2. 



Upon substituting in the values from (33) and (34), we obtain

$$\mathbb{E}_q\|\tilde{R}H\|_{1\to2} \le \gamma_0$$

with



70 



v/log(M(r + 1)) L c 



+ 



(73) 



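(74)

In order to establish the size condition, we now define the event $\mathcal{E} \stackrel{\text{def}}{=} \{\max_{1\le j\le|\mathcal{S}|}\|z_j\|_2 < \gamma\}$ and
make use of the Markov inequality along with (71), (72) and the preceding discussion to obtain

$$\Pr(\mathcal{E}^c) \le \gamma^{-q}\Big[\mathbb{E}_q\max_{1\le j\le|\mathcal{S}|}\|z_j\|_2\Big]^q \le \Big[\frac{2}{\gamma}\,\mathbb{E}_q\|\tilde{R}H\|_{1\to2}\Big]^q \le \Big(\frac{2\gamma_0}{\gamma}\Big)^q. \qquad (75)$$

Finally, we use $Z \stackrel{\text{def}}{=} \max_{1\le j\le|\mathcal{S}|}|\langle z_j,\mathrm{sgn}(\beta_{\mathcal{S}})\rangle|$ and conclude that

$$\Pr(Z \ge t) \le \Pr(Z \ge t \mid \mathcal{E}) + \Pr(\mathcal{E}^c) \stackrel{(a)}{\le} 2M(\tau+1)e^{-t^2/2\gamma^2} + (2\gamma_0/\gamma)^q, \qquad (76)$$

where (a) is a consequence of the Hoeffding inequality and the union bound. The condition is now
established from (43) by setting $t = 2$ in the above expression. Furthermore, set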

$$\gamma = \sqrt{\frac{2}{(1+2\log 2)\log(M(\tau+1))}}, \qquad (77)$$

which leads to $2M(\tau+1)e^{-2/\gamma^2} \le 2(M(\tau+1))^{-2\log 2}$



and 



70 < (2^-^^ + 2V^)(l + 21og2) ^ ^ 
7 ~ c 



(78) 



Therefore, we obtain that $\Pr(\mathcal{E}^c) \le (1/2)^q \le (M(\tau+1))^{-2\log 2}$ and thus we have that the size condition
does not hold with probability at most

$$\Pr(C_4^c \mid C_1) \le 3(M(\tau+1))^{-2\log 2}.$$



D. Complementary Size Condition 
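As in Appendix A-E, we define $z_j$ as $z_j \stackrel{\text{def}}{=} (X_{\mathcal{S}}^TX_{\mathcal{S}})^{-1}X_{\mathcal{S}}^TX_{\mathcal{S}^c}e_j$. It can then be easily seen that
$\|X_{\mathcal{S}^c}^TX_{\mathcal{S}}(X_{\mathcal{S}}^TX_{\mathcal{S}})^{-1}\mathrm{sgn}(\beta_{\mathcal{S}})\|_\infty = \max_{1\le j\le|\mathcal{S}^c|}|\langle z_j,\mathrm{sgn}(\beta_{\mathcal{S}})\rangle|$. Now condition on the event $C_1$ and
notice that $\|z_j\|_2 \le 2\|X_{\mathcal{S}}^TX_{\mathcal{S}^c}e_j\|_2$, $j = 1,\dots,|\mathcal{S}^c|$.
We then see that

$$\|X_{\mathcal{S}}^TX_{\mathcal{S}^c}e_j\|_2 \le \|X_{\mathcal{S}}^TX_{\mathcal{S}^c}\|_{1\to2} \stackrel{(a)}{\le} \|\tilde{R}H\|_{1\to2}, \qquad (79)$$

where (a) follows from the fact that $X_{\mathcal{S}}^TX_{\mathcal{S}^c}$ is a submatrix of $\tilde{R}H$. As in the previous subsection, by
redefining the event $\mathcal{E} \stackrel{\text{def}}{=} \{\max_{1\le j\le|\mathcal{S}^c|}\|z_j\|_2 < \gamma\}$ and $Z \stackrel{\text{def}}{=} \max_{1\le j\le|\mathcal{S}^c|}|\langle z_j,\mathrm{sgn}(\beta_{\mathcal{S}})\rangle|$ and using
(79), the bounds (73), (75) and (76) hold once again.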

In accordance with the complementary size condition, we take $t = 1/4$ and set

$$\gamma = \frac{1}{\sqrt{32(1+2\log 2)\log(M(\tau+1))}}$$

so that $2M(\tau+1)e^{-1/32\gamma^2} = 2(M(\tau+1))^{-2\log 2}$ and $\gamma_0/\gamma \le 1/4$. This gives us

$$\Pr(C_5^c \mid C_1) \le 3(M(\tau+1))^{-2\log 2}. \qquad (80)$$

E. Proof of Theorem 2

The proof of Theorem 2 follows from the preceding discussion by taking a union bound over all
the respective conditions and removing the conditionings: $\Pr((C_1 \cap C_2 \cap C_3 \cap C_4 \cap C_5)^c) \le \Pr(C_1^c) +
\Pr(C_2^c \mid C_1) + \Pr(C_3^c \mid C_1) + \Pr(C_4^c \mid C_1) + \Pr(C_5^c \mid C_1)$. Consequently, we obtain that the probability of error
is upper bounded by $2M^{-1}(2\pi\log(M\sqrt{\tau+1}))^{-1/2} + 7(M(\tau+1))^{-2\log 2}$.

Appendix C 
Proof of Lemma 3

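In order to bound the sum of (25), we will use the following propositions.
Proposition 3: For $x, y \in GF(2^m)$ and $i = 1, 2, \dots$,

$$(x+y)^{2^i+1} = x^{2^i+1} + y^{2^i+1} + \sum_{j=0}^{i-1}(xy)^{2^j}(x+y)^{2^i-2^{j+1}+1}.$$

Proof: We prove this by induction and application of $(x+y)^{2^i} = x^{2^i} + y^{2^i}$. That is, we first note
that the identity holds for $i = 1$ and, assuming it is true for $i$, we have

$$
\begin{aligned}
(x+y)^{2^{i+1}+1} &= (x+y)^{2^i}(x+y)^{2^i+1} \\
&= (x+y)^{2^i}\Big[x^{2^i+1} + y^{2^i+1} + \sum_{j=0}^{i-1}(xy)^{2^j}(x+y)^{2^i-2^{j+1}+1}\Big] \\
&= (x^{2^i}+y^{2^i})(x^{2^i+1}+y^{2^i+1}) + \sum_{j=0}^{i-1}(xy)^{2^j}(x+y)^{2^{i+1}-2^{j+1}+1} \\
&= x^{2^{i+1}+1} + y^{2^{i+1}+1} + (xy)^{2^i}(x+y) + \sum_{j=0}^{i-1}(xy)^{2^j}(x+y)^{2^{i+1}-2^{j+1}+1}.
\end{aligned}
$$

Incorporating the middle term in the sum completes the proof. ■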
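Proposition 3 is easy to confirm by exhaustive search over a small field. The following sketch checks the identity for every pair $(x, y)$ in $GF(2^4)$ and $i = 1,\dots,4$; the field size and the modulus $z^4 + z + 1$ are illustrative assumptions, and any irreducible polynomial of degree 4 would serve equally well.

```python
def gf_mul(a, b, m=4, poly=0b10011):
    """Multiply a and b in GF(2^m) = GF(2)[z]/(poly); default modulus z^4 + z + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a          # addition in characteristic 2 is XOR
        b >>= 1
        a <<= 1
        if (a >> m) & 1:    # reduce as soon as the degree reaches m
            a ^= poly
    return r

def gf_pow(a, e):
    """Raise a to the integer power e >= 0 by repeated squaring."""
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r

def identity_holds(x, y, i):
    """Check (x+y)^(2^i+1) = x^(2^i+1) + y^(2^i+1) + sum_j (xy)^(2^j) (x+y)^(2^i-2^(j+1)+1)."""
    lhs = gf_pow(x ^ y, 2 ** i + 1)
    rhs = gf_pow(x, 2 ** i + 1) ^ gf_pow(y, 2 ** i + 1)
    for j in range(i):
        rhs ^= gf_mul(gf_pow(gf_mul(x, y), 2 ** j),
                      gf_pow(x ^ y, 2 ** i - 2 ** (j + 1) + 1))
    return lhs == rhs

all_ok = all(identity_holds(x, y, i)
             for x in range(16) for y in range(16) for i in range(1, 5))
```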
Proposition 4 ([14, pp. 278-9]): The quadratic polynomial $x^2 + fx + g$ with coefficients in $GF(2^m)$
and $f \ne 0$ has two distinct roots in $GF(2^m)$ if $\mathrm{Tr}(g/f^2) = 0$ and no roots in $GF(2^m)$ if $\mathrm{Tr}(g/f^2) = 1$.
Proposition 5: (a) The cardinality of $\{g \in GF(2^m) : \mathrm{Tr}(\alpha g) = c_1\}$ is $2^{m-1}$ for $\alpha \in GF(2^m)$, $\alpha \ne 0$
and $c_1 \in GF(2)$. (b) The cardinality of $\{g \in GF(2^m) : \mathrm{Tr}(\alpha g) = c_1, \mathrm{Tr}(\beta g) = c_2\}$ is $2^{m-2}$ for
$\beta \in GF(2^m)$, $\beta \ne \alpha$, $\beta \ne 0$ and $c_2 \in GF(2)$.

Proof: Let $\{\eta_i\}_{i=1}^m$ and $\{\lambda_i\}_{i=1}^m$ be dual bases of $GF(2^m)$ [14, p. 117] and consider $\alpha$ and $g$ in these
bases respectively as $\alpha = \alpha_1\eta_1 + \dots + \alpha_m\eta_m$ and $g = \gamma_1\lambda_1 + \dots + \gamma_m\lambda_m$ for $\alpha_i, \gamma_i \in GF(2)$. Then
$\mathrm{Tr}(\alpha g) = \alpha_1\gamma_1 + \dots + \alpha_m\gamma_m = c_1$ is a restriction of a single degree of freedom in selecting $\{\gamma_i\}_{i=1}^m$.
Similarly, $\mathrm{Tr}(\beta g) = c_2$ restricts an additional degree of freedom. ■

We are now ready to bound the sum given by (25). We will begin with a simple (yet required) case
which illustrates our use of Proposition 5. Suppose the non-zero vector $a$ is zero everywhere but at $a_0$.
In this case we have

$$S = \sum_{x\in GF^*(2^m)}(-1)^{\mathrm{Tr}(a_0x)} = \sum_{x\in GF(2^m)}(-1)^{\mathrm{Tr}(a_0x)} - 1,$$

where we've completed the sum to be over all of $GF(2^m)$. By Proposition 5(a), $\mathrm{Tr}(a_0x) = 1$ for
precisely half the $2^m$ terms of the sum. Thus, the completed sum is 0 and $|S| = 1$. For the remainder of
the proof, we will assume that $a_i$ is non-zero for some $i \ge 1$.
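Propositions 4 and 5, together with the single-coefficient case just treated, can be verified by brute force in a small field. The sketch below works in $GF(2^4)$ with modulus $z^4 + z + 1$; both the field size and the modulus are assumptions made only to keep the search tiny.

```python
M_BITS, POLY = 4, 0b10011        # GF(2^4) with modulus z^4 + z + 1 (illustrative choice)
FIELD = range(1 << M_BITS)
NONZERO = range(1, 1 << M_BITS)

def gf_mul(a, b):
    """Multiply in GF(2^M_BITS)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> M_BITS) & 1:
            a ^= POLY
    return r

def gf_inv(a):
    """Inverse of a non-zero element, computed as a^(2^m - 2)."""
    r, e = 1, (1 << M_BITS) - 2
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r

def trace(a):
    """Absolute trace a + a^2 + ... + a^(2^(m-1)); the result is 0 or 1."""
    t = 0
    for _ in range(M_BITS):
        t ^= a
        a = gf_mul(a, a)
    return t

def num_roots(f, g):
    """Number of x in the field with x^2 + f x + g = 0."""
    return sum(1 for x in FIELD if (gf_mul(x, x) ^ gf_mul(f, x) ^ g) == 0)

# Proposition 4: two roots when Tr(g / f^2) = 0, none when it is 1.
prop4_ok = all(
    num_roots(f, g) == (2 if trace(gf_mul(g, gf_inv(gf_mul(f, f)))) == 0 else 0)
    for f in NONZERO for g in FIELD)

# Proposition 5(a): one trace constraint halves the field.
prop5a_ok = all(
    sum(1 for g in FIELD if trace(gf_mul(a, g)) == c) == 1 << (M_BITS - 1)
    for a in NONZERO for c in (0, 1))

# Proposition 5(b): two distinct constraints leave a quarter of the field.
prop5b_ok = all(
    sum(1 for g in FIELD
        if trace(gf_mul(a, g)) == c1 and trace(gf_mul(b, g)) == c2) == 1 << (M_BITS - 2)
    for a in NONZERO for b in NONZERO if a != b for c1 in (0, 1) for c2 in (0, 1))

# Single-coefficient case: the sum over non-zero x of (-1)^Tr(a0 x).
single_sums = {sum(-1 if trace(gf_mul(a0, x)) else 1 for x in NONZERO) for a0 in NONZERO}
```

In the single-coefficient case every sum collected in `single_sums` has absolute value 1, matching the argument above.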

Considering the square of (25), by using the linearity of the trace we have

$$
\begin{aligned}
S^2 &= \sum_{x\in GF^*(2^m)}\sum_{y\in GF^*(2^m)}(-1)^{\mathrm{Tr}[a_0(x+y)+\sum_{i=1}^{t}a_i(x^{2^i+1}+y^{2^i+1})]} \\
&= 2^m - 1 + \sum_{x\in GF^*(2^m)}\ \sum_{\substack{y\in GF^*(2^m) \\ y\ne x}}(-1)^{\mathrm{Tr}[a_0(x+y)+\sum_{i=1}^{t}a_i(x^{2^i+1}+y^{2^i+1})]} \\
&= 2^m - 1 + \sum_{x\in GF^*(2^m)}\ \sum_{\substack{y\in GF^*(2^m) \\ y\ne x}}(-1)^{\mathrm{Tr}[a_0(x+y)+\sum_{i=1}^{t}a_i((x+y)^{2^i+1}+\sum_{j=0}^{i-1}(xy)^{2^j}(x+y)^{2^i-2^{j+1}+1})]}.
\end{aligned}
$$

In the last equality we have used Proposition 3 so that we may apply the change of variables given by
$f = x + y$ and $g = xy$. To justify this substitution we note that

$$\{(x+y, xy) : x \in GF^*(2^m),\ y \in GF^*(2^m),\ y \ne x\} = \{(f,g) : f \in GF^*(2^m),\ g \in GF^*(2^m),\ \mathrm{Tr}(g/f^2) = 0\}.$$

To see this, consider quadratics $(z+x)(z+y) = z^2 + fz + g$ with non-zero roots. The first set generates all
quadratics with two solutions by enumerating the roots while, by Proposition 4, the second set generates
the same quadratics by enumerating the coefficients. Since with this substitution both $(x,y)$ and $(y,x)$
map to $(f,g)$ we account for the extra factor of 2 below. We now have

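$$
\begin{aligned}
S^2 &= 2^m - 1 + 2\sum_{\substack{f,g\in GF^*(2^m) \\ \mathrm{Tr}(g/f^2)=0}}(-1)^{\mathrm{Tr}[a_0f+\sum_{i=1}^{t}a_i(f^{2^i+1}+\sum_{j=0}^{i-1}g^{2^j}f^{2^i-2^{j+1}+1})]} \\
&= 2^m - 1 + 2\sum_{f\in GF^*(2^m)}(-1)^{\mathrm{Tr}(a_0f+\sum_{i=1}^{t}a_if^{2^i+1})}\sum_{\substack{g\in GF^*(2^m) \\ \mathrm{Tr}(g/f^2)=0}}(-1)^{\mathrm{Tr}(\sum_{i=1}^{t}\sum_{j=0}^{i-1}a_ig^{2^j}f^{2^i-2^{j+1}+1})} \\
&= 2^m - 1 + 2\sum_{f\in GF^*(2^m)}(-1)^{\mathrm{Tr}(a_0f+\sum_{i=1}^{t}a_if^{2^i+1})}\Big[\sum_{\substack{g\in GF(2^m) \\ \mathrm{Tr}(g/f^2)=0}}(-1)^{\mathrm{Tr}(\sum_{i=1}^{t}\sum_{j=0}^{i-1}a_ig^{2^j}f^{2^i-2^{j+1}+1})} - 1\Big] \\
&\le 3(2^m - 1) + 2\sum_{f\in GF^*(2^m)}(-1)^{\mathrm{Tr}(a_0f+\sum_{i=1}^{t}a_if^{2^i+1})}\sum_{\substack{g\in GF(2^m) \\ \mathrm{Tr}(g/f^2)=0}}(-1)^{\mathrm{Tr}(\sum_{i=1}^{t}\sum_{j=0}^{i-1}a_ig^{2^j}f^{2^i-2^{j+1}+1})}, \qquad (81)
\end{aligned}
$$

where, in the third equality, we've completed the sum in $g$ to include $g = 0$. The resulting subtraction
creates a sum over $f$ which we trivially bound by $2^m - 1$. Turning our attention to the innermost sum
over $g$, we will show that the sum is either 0 or $2^{m-1}$. Further, we will bound the number of $f$ for which
it is not zero.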

To separate $g$, we can use the linearity of the trace and $\mathrm{Tr}(x) = \mathrm{Tr}(x^{2^{-j}})$ for each $j$ and rewrite the
exponent of the inner sum as $\mathrm{Tr}[(\sum_{i=1}^{t}\sum_{j=0}^{i-1}a_i^{2^{-j}}f^{2^{i-j}+2^{-j}-2})g] = \mathrm{Tr}(\Gamma_fg)$ where we've introduced
$\Gamma_f \in GF(2^m)$ to simplify notation. Suppose, for a fixed $f$, there exists some $g$ with $\mathrm{Tr}(g/f^2) = 0$ such
that $(-1)^{\mathrm{Tr}(\Gamma_fg)} = -1$. Then we must have $\Gamma_f \ne 1/f^2$ and $\Gamma_f \ne 0$. In this case, Proposition 5 tells
us that the inner sum of (81) evaluates to 0 since part (a) gives the size of the sum while (b) shows
precisely half the terms take value $(-1)$. Thus, we are interested in when $(-1)^{\mathrm{Tr}(\Gamma_fg)}$ maps all of the
subset $\{g \in GF(2^m) : \mathrm{Tr}(g/f^2) = 0\}$ to 1.

When $(-1)^{\mathrm{Tr}(\Gamma_fg)}$ is a trivial map of the subset, we have $\{g : \mathrm{Tr}(\Gamma_fg) = 0\} \supseteq \{g : \mathrm{Tr}(g/f^2) = 0\}$
which provides two cases. The first is that $\Gamma_f = 0$ and the above inclusion is strict. In the second case, when
$\Gamma_f \ne 0$, the sets have the same cardinality by Proposition 5 and, thus, the two sets are equal. In this case, by
$\mathrm{Tr}[(\Gamma_f + 1/f^2)g] = 0\ \forall g$, the non-degeneracy of the trace [21, Proposition 28.87] tells us $\Gamma_f = 1/f^2$.
In both cases, Proposition 5 gives the size of the inner sum of (81) as $2^{m-1}$. The task now becomes to
bound the number of $f$ for which each of these cases occur.

$\Gamma_f = 0$ defines the following polynomial in $f$:

$$0 = \sum_{i=1}^{t}\sum_{j=0}^{i-1}a_i^{2^{-j}}f^{2^{i-j}+2^{-j}-2} = \sum_{i=1}^{t}\sum_{j=0}^{i-1}a_i^{2^{t-j-1}}f^{2^{t+i-j-1}+2^{t-j-1}-2^{t}},$$

where, in the second equality, we've used $(x+y)^{2^{t-1}} = x^{2^{t-1}} + y^{2^{t-1}}$ to ensure the powers of $f$ are positive
integers. The degree of this polynomial is at most $2^{2t-1} + 2^{t-1} - 2^{t}$ and thus we have at most $2^{2t-1} + 2^{t-1} - 2^{t}$
roots at which $\Gamma_f = 0$.

The case for $f^2\Gamma_f = 1$ is similar and follows the same steps on a slightly different polynomial. In
this case we find there are at most $2^{2t-1} + 2^{t-1}$ values of $f$ for which $\Gamma_f = 1/f^2$. Combining the two
cases, we find that there are at most $2^{2t}$ values of $f$ for which $(-1)^{\mathrm{Tr}(\Gamma_fg)}$ is a trivial map over the
sum. Returning to (81), we've found the sum in $f$ has terms with values of either 0 or $\pm2^{m-1}$ with the
non-zero terms occurring at most $2^{2t}$ times. Thus,

$$S^2 \le 3(2^m - 1) + 2 \times 2^{2t} \times 2^{m-1} \le 2^{m+2t+1}.$$

Taking the square root gives the result.
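The resulting bound $|S| \le 2^{(m+2t+1)/2}$ can be compared against an exhaustive computation for small parameters. The sketch below assumes the sum has the form $S = \sum_{x \in GF^*(2^m)} (-1)^{\mathrm{Tr}(a_0x + \sum_{i=1}^{t} a_ix^{2^i+1})}$, as read off from the exponents used in the derivation above, and uses $t = 1$ with two small fields; the moduli $z^4 + z + 1$ and $z^5 + z^2 + 1$ and all function names are illustrative assumptions.

```python
import itertools

def make_field(m, poly):
    """Return (mul, power, trace) for GF(2^m) = GF(2)[z]/(poly)."""
    def mul(a, b):
        r = 0
        while b:
            if b & 1:
                r ^= a
            b >>= 1
            a <<= 1
            if (a >> m) & 1:
                a ^= poly
        return r

    def power(a, e):
        r = 1
        while e:
            if e & 1:
                r = mul(r, a)
            a = mul(a, a)
            e >>= 1
        return r

    def trace(a):
        t = 0
        for _ in range(m):
            t ^= a
            a = mul(a, a)
        return t  # always 0 or 1

    return mul, power, trace

def worst_abs_sum(m, poly, t):
    """Largest |S| over all non-zero coefficient vectors (a_0, ..., a_t)."""
    mul, power, trace = make_field(m, poly)
    size = 1 << m
    # monomials[x] = [x, x^(2^1+1), ..., x^(2^t+1)]
    monomials = [[x] + [power(x, 2 ** i + 1) for i in range(1, t + 1)]
                 for x in range(size)]
    worst = 0
    for a in itertools.product(range(size), repeat=t + 1):
        if not any(a):
            continue                      # skip the all-zero vector
        s = 0
        for x in range(1, size):          # sum over non-zero field elements
            arg = 0
            for coeff, mono in zip(a, monomials[x]):
                arg ^= mul(coeff, mono)
            s += -1 if trace(arg) else 1
        worst = max(worst, abs(s))
    return worst

# (worst observed |S|, bound 2^((m+2t+1)/2)) for each (m, t) pair tried.
results = {(m, t): (worst_abs_sum(m, poly, t), 2 ** ((m + 2 * t + 1) / 2))
           for m, poly, t in [(4, 0b10011, 1), (5, 0b100101, 1)]}
```

For each entry of `results`, the first component is the worst case found by enumeration and the second is the bound; a search like this only illustrates the lemma for tiny fields and does not replace the proof.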
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